Let's review the relationship between "docx" and "pictures":
As I understand it, a *.docx file stores the original pictures (the pictures as they were at the moment you copied/pasted them into Word). Every time you use a picture, Word creates a "link" to the original picture.
But if you make changes to that picture (for example resize, crop, or recolor it), Word records your changes by modifying the "link" (it adds some special tags) and leaves the stored picture untouched. That's great, because you never lose the quality of your picture!
Let's get a picture from our *.docx file. To do that I use this code snippet:
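import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.FileInputStream;
import javax.imageio.ImageIO;
import org.apache.poi.xwpf.usermodel.*;

XWPFDocument wordDoc = new XWPFDocument(new FileInputStream(pathToFile));
for (XWPFParagraph p : wordDoc.getParagraphs()) {
    for (XWPFRun run : p.getRuns()) {
        for (XWPFPicture pic : run.getEmbeddedPictures()) {
            // getData() returns the bytes of the original embedded image
            byte[] img = pic.getPictureData().getData();
            File outputfile = new File(pathToOutputFile);
            BufferedImage image = ImageIO.read(new ByteArrayInputStream(img));
            ImageIO.write(image, "png", outputfile);
        }
    }
}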
But this way I get the original pictures from the *.docx. If, for example, you cropped a section out of your picture and gave me the rest, I still always find the whole image in outputfile. That's not good.
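For the crop case only, here is a rough sketch of what applying the stored change by hand might look like. It assumes the crop is recorded as a DrawingML srcRect with l, t, r and b values given in thousandths of a percent (100000 = 100%). I believe those values can be reached through something like pic.getCTPicture().getBlipFill().getSrcRect(), but I have not verified that, so the sketch takes them as plain ints. CropSketch and applySrcRect are made-up names, and negative values and every other kind of change (resize, recolor) are not handled.

```java
import java.awt.image.BufferedImage;

public class CropSketch {
    // Assumption: srcRect percentages are in 1/1000 of a percent (100000 = 100%).
    private static final double FULL = 100_000.0;

    /**
     * Crops the original image by the fractions trimmed from its
     * left (l), top (t), right (r) and bottom (b) edges.
     */
    public static BufferedImage applySrcRect(BufferedImage original,
                                             int l, int t, int r, int b) {
        if (l < 0 || t < 0 || r < 0 || b < 0) {
            throw new IllegalArgumentException("negative offsets are not handled in this sketch");
        }
        int w = original.getWidth();
        int h = original.getHeight();
        int x = (int) Math.round(w * l / FULL);
        int y = (int) Math.round(h * t / FULL);
        int croppedWidth = w - x - (int) Math.round(w * r / FULL);
        int croppedHeight = h - y - (int) Math.round(h * b / FULL);
        if (croppedWidth <= 0 || croppedHeight <= 0) {
            throw new IllegalArgumentException("crop removes the whole image");
        }
        // getSubimage shares pixel data with the original, which is fine for writing to disk
        return original.getSubimage(x, y, croppedWidth, croppedHeight);
    }

    public static void main(String[] args) {
        BufferedImage original = new BufferedImage(200, 100, BufferedImage.TYPE_INT_RGB);
        // Trim 25% from the left and right, 10% from the top, 40% from the bottom
        BufferedImage cropped = applySrcRect(original, 25_000, 10_000, 25_000, 40_000);
        System.out.println(cropped.getWidth() + "x" + cropped.getHeight()); // prints 100x50
    }
}
```

In the loop above, the result of applySrcRect(image, l, t, r, b) would be passed to ImageIO.write in place of image. This still covers only cropping.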
Does anyone know how to get the picture with all the changes that someone made to it in Word?