I read data from a database and generate an HTML DOM from it. The data volume is too large to fit in memory at once, but it can be provided chunk by chunk.
I would like to transform the resulting HTML into PDF using Flying Saucer:
    import java.io.IOException;
    import java.io.OutputStream;
    // Assumed sources for the unqualified names used below:
    // iText 2.x DocumentException and Apache Commons IO IOUtils.
    import com.lowagie.text.DocumentException;
    import org.apache.commons.io.IOUtils;
    import org.xhtmlrenderer.pdf.ITextRenderer;
    import org.dom4j.Document;
    import org.dom4j.DocumentFactory;
    import org.dom4j.Element;
    import org.dom4j.io.DOMWriter;

    OutputStream bodyStream = outputMessage.getBody();
    ITextRenderer renderer = new ITextRenderer();
    DocumentFactory documentFactory = DocumentFactory.getInstance();
    DOMWriter domWriter = new DOMWriter();
    Element htmlNode = documentFactory.createElement("html");
    Document htmlDocument = documentFactory.createDocument(htmlNode);
    int currentLine = 1;
    int currentPage = 1;
    try {
        while (currentLine <= numberOfLines) {
            // Load the next chunk of rows into the DOM and advance the cursor:
            currentLine += loadDataToDOM(documentFactory, htmlNode, currentLine, CHUNK_SIZE);
            // Convert the dom4j document to a W3C DOM and lay it out:
            renderer.setDocument(domWriter.write(htmlDocument), null);
            renderer.layout();
            if (currentPage == 1) {
                // For the first chunk the PDF writer is created (and left open):
                renderer.createPDF(bodyStream, false);
            }
            else {
                // Other documents are appended to the current PDF writer:
                renderer.writeNextDocument(currentPage);
            }
            currentPage += renderer.getRootBox().getLayer().getPages().size();
        }
        // Finalise the PDF:
        renderer.finishPDF();
    }
    catch (DocumentException e) {
        throw new IOException(e);
    }
    catch (org.dom4j.DocumentException e) {
        throw new IOException(e);
    }
    finally {
        IOUtils.closeQuietly(bodyStream);
    }
The problem with this approach is that the last page of each chunk is not necessarily completely filled with data. Is there any way to fill that space? For example, I could check whether the last page is not completely filled and then discard it (not write it to the PDF), find out which data was rendered on that page, and rewind the position in the database (currentLine in the example). It would be nice if someone could post a complete solution.
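To make the rewind idea concrete, here is a minimal sketch of just the bookkeeping I have in mind. It does not call Flying Saucer at all: LayoutProbe, Batch and plan are hypothetical names I made up for illustration, and the probe stands in for whatever code would inspect the layout result and report how many lines ended up on completely filled pages. How to get that number out of the renderer is exactly the part I am missing.

```java
import java.util.ArrayList;
import java.util.List;

public class ChunkRewindSketch {

    /** Hypothetical probe: how many lines of a laid-out chunk sit on completely filled pages. */
    public interface LayoutProbe {
        int linesOnFullPages(int firstLine, int lineCount);
    }

    /** One batch that is actually written to the PDF. */
    public static final class Batch {
        public final int firstLine;
        public final int lineCount;

        Batch(int firstLine, int lineCount) {
            this.firstLine = firstLine;
            this.lineCount = lineCount;
        }
    }

    /** Plans the batches: every chunk except the last one is cut back to full pages. */
    public static List<Batch> plan(int numberOfLines, int chunkSize, LayoutProbe probe) {
        List<Batch> batches = new ArrayList<>();
        int currentLine = 1;
        while (currentLine <= numberOfLines) {
            int loaded = Math.min(chunkSize, numberOfLines - currentLine + 1);
            boolean lastChunk = currentLine + loaded > numberOfLines;
            // The last chunk keeps its partial page; earlier chunks drop it.
            int kept = lastChunk ? loaded : probe.linesOnFullPages(currentLine, loaded);
            if (kept <= 0) {
                throw new IllegalStateException("Chunk does not fill a single page; use a larger chunk size");
            }
            batches.add(new Batch(currentLine, kept));
            // Rewind: the next chunk starts at the first line of the discarded page.
            currentLine += kept;
        }
        return batches;
    }

    public static void main(String[] args) {
        // Assumption for the demo only: every line has the same height, 10 lines per page.
        final int linesPerPage = 10;
        List<Batch> batches = plan(95, 25, (first, count) -> (count / linesPerPage) * linesPerPage);
        for (Batch b : batches) {
            System.out.println("write lines " + b.firstLine + ".." + (b.firstLine + b.lineCount - 1));
        }
    }
}
```

The cost of this scheme is that the lines on each discarded page are loaded and laid out twice, so the chunk size should be large compared with one page.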
moveTo(), lineTo(), beginText())? Now I have 50 lines of code, easy to manage. HTML and CSS are familiar to everyone. Changing the layout or colors is no problem. Bruno, I have looked briefly at your book "iText in Action" (many thanks for it!) and the headers/footers magic on page 430 (chapter 14) is already scary. I would happily use com.itextpdf.tool.xml.pipeline.html.HtmlPipeline but it does not support basic CSS selectors, not to mention floating boxes. – Subcontraoctave