I am using PDFBox 2.0.4 to render PDF pages as images, and the rendered page contains several "black holes", as shown in the following screenshot:
This happens only with this PDF and a few others: http://www.filedropper.com/selection_3
Here is a minimal example (with JavaFX) that reproduces the problem (change the file path after downloading the PDF):
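import java.awt.image.BufferedImage;
import java.io.File;
import javafx.application.Application;
import javafx.embed.swing.SwingFXUtils;
import javafx.scene.Scene;
import javafx.scene.image.Image;
import javafx.scene.image.ImageView;
import javafx.scene.layout.BorderPane;
import javafx.stage.Stage;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.rendering.PDFRenderer;

public class PDFExtractionTest extends Application {
    @Override
    public void start(Stage primaryStage) throws Exception {
        BufferedImage bufferedImage;
        // try-with-resources closes the document once the page has been rendered
        try (PDDocument document = PDDocument.load(new File("C:\\Users\\John\\Desktop\\selection.pdf"))) {
            PDFRenderer pdfRenderer = new PDFRenderer(document);
            // Page indexes are 0-based, so this renders the second page
            bufferedImage = pdfRenderer.renderImage(1);
        }
        Image fxImage = SwingFXUtils.toFXImage(bufferedImage, null);
        BorderPane borderPane = new BorderPane();
        ImageView imageView = new ImageView(fxImage);
        borderPane.setCenter(imageView);
        primaryStage.setScene(new Scene(borderPane, 1024, 768));
        primaryStage.show();
    }

    public static void main(String[] args) {
        launch(args);
    }
}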
Here are my dependencies:
- pdfbox 2.0.4
- jai-imageio-jpeg2000 1.3.0 (prevents the error "Cannot read JPEG2000 image: Java Advanced Imaging (JAI) Image I/O Tools are not installed")
- levigo-jbig2-imageio 1.6.5 (prevents the error "Cannot read JBIG2 image: jbig2-imageio is not installed")
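To rule out a classpath problem with the two image plugins listed above, a small check like the following could confirm that ImageIO actually sees them at runtime. This is only a sketch: the class name ImageIoPluginCheck is hypothetical, and the format names "JPEG2000" and "JBIG2" are my assumption about what the plugins register.

```java
import javax.imageio.ImageIO;

// Hypothetical helper (not part of PDFBox): reports whether ImageIO
// has a reader registered for a given format name.
class ImageIoPluginCheck {

    static boolean hasReader(String formatName) {
        // The iterator is empty when no installed plugin handles the format
        return ImageIO.getImageReadersByFormatName(formatName).hasNext();
    }

    public static void main(String[] args) {
        // Assumed format names for the JPEG 2000 and JBIG2 plugins
        for (String format : new String[] {"JPEG2000", "JBIG2"}) {
            System.out.println(format + " reader registered: " + hasReader(format));
        }
    }
}
```

If either line prints false when run with the same classpath as the application, the corresponding plugin jar is probably not being picked up.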
The logs contain the following warnings, but I don't know whether they are related to the problem. How can I fix it?
févr. 01, 2017 11:20:51 AM org.apache.pdfbox.pdmodel.font.PDSimpleFont toUnicode
AVERTISSEMENT: No Unicode mapping for .notdef (9) in font Times-Bold
févr. 01, 2017 11:20:51 AM org.apache.pdfbox.rendering.Type1Glyph2D getPathForCharacterCode
AVERTISSEMENT: No glyph for 9 (.notdef) in font Times-Bold
févr. 01, 2017 11:20:51 AM org.apache.pdfbox.pdmodel.font.PDSimpleFont toUnicode
AVERTISSEMENT: No Unicode mapping for .notdef (9) in font Helvetica
févr. 01, 2017 11:20:51 AM org.apache.pdfbox.rendering.Type1Glyph2D getPathForCharacterCode
AVERTISSEMENT: No glyph for 9 (.notdef) in font Helvetica
Did I miss something in my code, or should I report a bug?