I struggled with this for many weeks. Many answers on SO helped me along, but something was always missing; apparently no one here has ever had problems with JBIG2-encoded images.
In the batch of PDFs that I have to process, JBIG2-encoded images are very common.
As far as I understand, many copy/scan machines scan paper and turn it into PDF files full of JBIG2-encoded images.
So after many days of testing, I decided to go with the answer proposed here by dkagedal a long time ago.
Here is my step-by-step on Linux (if you are on another OS, I suggest using a Linux Docker container; it is going to be much easier):
First step:
apt-get install poppler-utils
Then I was able to run a command-line tool called pdfimages like this:
pdfimages -all myfile.pdf ./images_found/
The above command extracts all the images contained in myfile.pdf and saves them inside images_found (you have to create images_found beforehand).
Among the extracted files you may find several types of images (depending on your PDF), such as png, jpg, and tiff; all of these are easily readable with any graphics tool.
Then you will have some files named like: -145.jb2e and -145.jb2g.
These 2 files contain ONE IMAGE encoded in JBIG2, saved in 2 different files: the .jb2g file holds the global (header) data and the .jb2e file holds the embedded image data.
Again, I lost many days trying to find out how to convert those files into something readable, and finally I came across a tool called jbig2dec.
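The extraction step above can be wrapped in a small shell function. This is only a sketch: the name extract_images is made up for illustration, and it assumes pdfimages from poppler-utils is on the PATH. It also runs pdfimages -list first, which prints one row per image with its encoding, so you can check whether the PDF actually contains jbig2 images.

```shell
# Hypothetical helper: extract every image from a PDF into a directory.
# Usage: extract_images myfile.pdf images_found
extract_images() {
    pdf=$1
    outdir=$2
    # Create the output directory first, as noted above.
    mkdir -p "$outdir" || return 1
    # Print a table of the images; the "enc" column shows the encoding.
    pdfimages -list "$pdf"
    # The trailing slash makes the files land inside outdir, named -NNN.ext.
    pdfimages -all "$pdf" "$outdir/"
}
```

Calling extract_images myfile.pdf images_found should leave the same files in images_found as the manual command.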
So first you need to install this magic tool:
apt-get install jbig2dec
then you can run (the ./ prefix keeps the leading dash in the file names from being read as an option):
jbig2dec -t png ./-145.jb2g ./-145.jb2e
You will finally get all the extracted images converted into something useful.
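With many images it is tedious to run jbig2dec by hand, so the conversion can be looped over every pair. This is a sketch under assumptions: the function name convert_jbig2_dir is hypothetical, it uses jbig2dec's -o option to name the output file explicitly, and it simply skips any .jb2e file that has no matching .jb2g file rather than trying to decode it alone.

```shell
# Hypothetical helper: decode every .jb2g/.jb2e pair in a directory.
# Usage: convert_jbig2_dir images_found
convert_jbig2_dir() {
    dir=$1
    for embedded in "$dir"/*.jb2e; do
        # Skip the unexpanded pattern when there are no .jb2e files.
        [ -e "$embedded" ] || continue
        base=${embedded%.jb2e}
        globals=$base.jb2g
        if [ -f "$globals" ]; then
            # Same argument order as above: .jb2g first, then .jb2e.
            jbig2dec -t png -o "$base.png" "$globals" "$embedded"
        else
            echo "skipping $embedded: no matching .jb2g file" >&2
        fi
    done
}
```

Because every path passed to jbig2dec starts with the directory name, the leading dash in names like -145.jb2e is not mistaken for an option.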
good luck!
With the file utility I get: Netpbm image data, size = 902 x 1523, rawbits, bitmap. It is far more usable, but it seems that type png isn't emitted; I get -145.pbm. – Inroad
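If, as in the comment above, you end up with .pbm files instead of .png files, one possible workaround is to convert them afterwards. This sketch assumes pnmtopng from the netpbm package is installed (that tool is not part of the original answer), and the function name pbm_dir_to_png is made up for illustration.

```shell
# Hypothetical helper: convert every .pbm file in a directory to .png.
# Usage: pbm_dir_to_png images_found
pbm_dir_to_png() {
    dir=$1
    for pbm in "$dir"/*.pbm; do
        # Skip the unexpanded pattern when there are no .pbm files.
        [ -e "$pbm" ] || continue
        # pnmtopng writes the PNG to standard output.
        pnmtopng "$pbm" > "${pbm%.pbm}.png" || return 1
    done
}
```

The original .pbm files are left in place, so you can check the results before deleting them.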