imagemagick reduce size of pdf

I need to automatically reduce the size of some user-uploaded PDFs so that they can be sent via email.

I have a small ImageMagick one-liner that reduces the size for me:

convert -density 120 -quality 10 -compress jpeg original.pdf output.pdf

It renders every page of the PDF as a JPEG at the given density and quality, then repacks the pages into a new PDF.

This works well, except that the output file sometimes ends up bigger than the input. I then have to rerun the command, tweaking density and quality, to find the smallest size at which the text in the PDF is still readable.
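One way to guard against the larger-output case, as a minimal sketch: write the converted PDF to a temporary file and keep it only if it is smaller than the original. The function names file_size, keep_smaller and shrink_pdf are hypothetical helpers, not part of ImageMagick; the convert call is the one-liner above with density and quality turned into optional arguments.

```shell
#!/bin/sh
# file_size PATH: print the size of PATH in bytes.
file_size() {
    wc -c < "$1" | tr -d ' '
}

# keep_smaller ORIGINAL CANDIDATE OUTPUT
# Copy whichever of ORIGINAL and CANDIDATE is smaller to OUTPUT
# (the original wins a tie, since it has not been recompressed).
keep_smaller() {
    if [ "$(file_size "$2")" -lt "$(file_size "$1")" ]; then
        cp "$2" "$3"
    else
        cp "$1" "$3"
    fi
}

# shrink_pdf INPUT OUTPUT [DENSITY] [QUALITY]
# Run the conversion into a temporary file, then keep the smaller result.
# The "pdf:" prefix tells ImageMagick to write a PDF regardless of the
# temporary file's name.
shrink_pdf() {
    tmp=$(mktemp) || return 1
    convert -density "${3:-120}" -quality "${4:-10}" -compress jpeg \
        "$1" "pdf:$tmp" && keep_smaller "$1" "$tmp" "$2"
    status=$?
    rm -f "$tmp"
    return $status
}
```

Usage would be along the lines of: shrink_pdf original.pdf output.pdf 120 10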

I'm not sure how to automate this. I thought of using identify to get the characteristics of the files (height, width, density, ...) and then doing something like halving those figures, but I'm struggling to get this information out of the files.
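A hedged sketch of the identify idea: identify -format with the %w, %h, %x and %y escapes prints the pixel width, height and resolution of each page. One caveat to keep in mind, stated here as an assumption to verify on your setup: for a PDF, ImageMagick rasterizes each page first (at 72 dpi unless -density is given), so these figures describe the rendering, not the images embedded in the file. The helper names page_info and pixels_at_density are hypothetical; the second one only does the points-to-pixels arithmetic (1 pt = 1/72 inch).

```shell
#!/bin/sh
# page_info FILE
# Print one line per page: width height x-resolution y-resolution.
page_info() {
    identify -format '%w %h %x %y\n' "$1"
}

# pixels_at_density POINTS DENSITY
# Print the pixel length of a page side of POINTS points when it is
# rendered at DENSITY dpi, rounded to the nearest integer.
pixels_at_density() {
    awk -v pt="$1" -v dpi="$2" 'BEGIN { printf "%d\n", pt * dpi / 72 + 0.5 }'
}
```

For an A4 page (595 x 842 pt), pixels_at_density 595 150 and pixels_at_density 842 150 give 1240 and 1754, which helps to estimate how many pixels a given -density value will produce before running the conversion.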

Any suggestions?

Thanks,

Silage answered 4/3, 2022 at 13:41 Comment(6)
I was thinking that if I can extract the size and density data, I could try working with these. - Silage
I believe the -compress jpeg (and perhaps even the -quality 10) directives are dispensable. - Zerk
You can use -compress LZW or -compress lzw; that is better for color pages. - Stetson
Note: The explicit -compress jpeg directive is absolutely necessary to achieve the goal: lossy JPEG recompression with tweaked resolution and quality parameters! Without this parameter, the image assets within the PDF are stored as uncompressed bitmaps and only the -density parameter has an effect. Downsampling takes place, but no lossy image compression. An inspection of the different variants with pdfimages -list confirmed this. - Altazimuth
See my proof in detail in the standalone answer below. - Altazimuth
In place of -quality XX, try -define jpeg:extent={size}, where size is specified with a suffix, for example "400kb". - Preciosa

Addendum: The parameter -compress jpeg must be passed explicitly, or else you will end up with uncompressed image assets in your PDF:

$ cd ~/Pictures/Scans/

$ pdfimages -list Test.pdf 
page   num  type   width height color comp bpc  enc interp  object ID x-ppi y-ppi size ratio
--------------------------------------------------------------------------------------------
   1     0 image    4961  7016  icc     3   8  jpeg   yes        5  0   600   600 5907K 5.8%

$ convert -density 150 -quality 60  Test.pdf Test-150-060.pdf 

$ pdfimages -list Test-150-060.pdf 
page   num  type   width height color comp bpc  enc interp  object ID x-ppi y-ppi size ratio
--------------------------------------------------------------------------------------------
   1     0 image    1240  1754  rgb     3   8  image  no         8  0   150   150 6397K 100%
   1     1 smask    1240  1754  gray    1   8  image  no         8  0   150   150 33.5K 1.6%

$ convert -density 150 -quality 60 -compress jpeg  Test.pdf Test-150-060-jpeg.pdf 

$ pdfimages -list Test-150-060-jpeg.pdf 
page   num  type   width height color comp bpc  enc interp  object ID x-ppi y-ppi size ratio
--------------------------------------------------------------------------------------------
   1     0 image    1240  1754  rgb     3   8  jpeg   no         8  0   150   150 42.5K 0.7%
   1     1 smask    1240  1754  gray    1   8  image  no         8  0   150   150 33.5K 1.6%
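
The listings above also suggest a way to pick a density automatically. As a sketch, assuming the pdfimages -list column layout shown here (x-ppi and y-ppi as whitespace-separated fields 13 and 14), the highest resolution among the embedded images can be read out and used as an upper bound for -density, so that pages are never rendered at a higher resolution than their source images. max_image_ppi and capped_density are hypothetical helper names.

```shell
#!/bin/sh
# max_image_ppi
# Read `pdfimages -list` output on stdin and print the highest x-ppi or
# y-ppi among rows of type "image" (smask rows are ignored).
# Prints 0 if the listing contains no images.
max_image_ppi() {
    awk 'NR > 2 && $3 == "image" {
             if ($13 + 0 > max) max = $13 + 0
             if ($14 + 0 > max) max = $14 + 0
         }
         END { print max + 0 }'
}

# capped_density REQUESTED SOURCE_PPI
# Print REQUESTED, lowered to SOURCE_PPI when the source resolution is
# known (greater than 0) and below the requested density.
capped_density() {
    if [ "$2" -gt 0 ] && [ "$2" -lt "$1" ]; then
        echo "$2"
    else
        echo "$1"
    fi
}
```

With the first listing, pdfimages -list Test.pdf | max_image_ppi prints 600, and capped_density 150 600 leaves the requested 150 unchanged. A purely vector PDF yields 0, in which case the requested density is returned as is.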
Altazimuth answered 18/12, 2023 at 22:29 Comment(0)
