How to convert a PDF into JPG with command line in Linux? [closed]

What are fast and reliable ways for converting a PDF into a (single) JPEG using the command line on Linux?

Deadeye answered 29/3, 2017 at 6:25 Comment(2)
If you build xpdf from sources it comes with little utilities for things like pdftotext, pdftojpeg, and pdftohtml. They might be distributed with some Linux distros but they don't seem to be in this Debian I'm using.Endosmosis
Sorry, they're in poppler-utils. pdfdetach, pdffonts, pdfimages, pdfinfo, pdfseparate, pdfsig, pdftocairo, pdftohtml, pdftoppm, pdftops, pdftotext and pdfunite. Or build xpdf from sources, I'm pretty sure.Endosmosis
C
85

You can try ImageMagick's convert utility.

On Ubuntu, you can install it with this command:

$ sudo apt-get install imagemagick

Use convert like this:

$ convert input.pdf output.jpg
# For good quality use these parameters
$ convert -density 300 -quality 100 in.pdf out.jpg
Cinereous answered 29/3, 2017 at 6:31 Comment(9)
If you get an error like convert: not authorized 'filename.pdf' @ error/constitute.c/ReadImage/412., then you need to modify /etc/ImageMagick-6/policy.xml or just temporarily rename it.Semiannual
Unfortunately this doesn't work for me. See the pdftoppm answer for a more effective solution.Marleen
I needed this to get a high-quality JPG instead of a low-resolution one: convert -density 300 -quality 100 in.pdf out.jpgBerky
@Semiannual what modifications are needed in policy.xml ?Bushmaster
@Guus : In my policy.xml, I need to comment out the line <policy domain="coder" rights="none" pattern="PDF" />. But I usually prefer just temporarily renaming the file when I come across this error.Semiannual
@Semiannual Seriously, that file is a pain in my side. I wish there were no such limitations by default, so I didn't have to dig around so often to figure out why it won't do what I ask.Sunbreak
what kind of monstrosity is that? Why is there a "policy" in the first place? What are these stupid rules protecting against?Gone
ImageMagick doesn't work on many distros and making it work on Linux causes security issues. Please stop recommending it.Auteur
Don't use the -flatten and -trim options when converting a multipage printable PDF: -flatten stacks all pages on top of each other in a single image, and -trim cuts the white margin down to the text.Patrizia
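Before touching policy.xml, it can help to check whether the policy is actually what blocks the PDF. This is only a rough sketch: the function name pdf_blocked_by_policy is made up for illustration, the default path is the one from the comment above (it varies by ImageMagick version and distro), and the match assumes the rule is written with the same attribute order as the line quoted above.

```shell
# Hypothetical helper: exits 0 if the given ImageMagick policy file contains
# an active rule that denies the PDF coder, non-zero otherwise.
# Assumption: this is a plain text match, not an XML parser. A rule hidden
# inside a multi-line <!-- ... --> comment would still be reported.
pdf_blocked_by_policy() {
    policy_file=${1:-/etc/ImageMagick-6/policy.xml}
    [ -r "$policy_file" ] || return 1
    # Drop single-line XML comments, then look for rights="none" on PDF.
    sed 's/<!--.*-->//' "$policy_file" |
        grep -q 'domain="coder"[^>]*rights="none"[^>]*pattern="[^"]*PDF'
}
```

Calling pdf_blocked_by_policy && echo "PDF is blocked by policy" only reports the state; it changes nothing, which matters given the security concerns raised above.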
G
129

For the life of me, over the last 5 years, I cannot get imagemagick to work consistently (if at all) for me, and I don't know why people continually recommend it again and again. I just googled how to convert a PDF to a JPEG today, found this answer, and tried convert, and it doesn't work at all for me:

Broken command (doesn't work for me):

# BROKEN cmd
$ convert in.pdf out.jpg
convert-im6.q16: not authorized `in.pdf' @ error/constitute.c/ReadImage/412.
convert-im6.q16: no images defined `out.jpg' @ error/convert.c/ConvertImageCommand/3258.

(Update 24 Feb. 2022: here is the fix for imagemagick so convert will work. See also my comment here, and my comments under this answer here. I still like pdftoppm, below, much better, however.)

Then, I remembered there was another tool I use and wrote about, so I googled "linux convert pdf to jpg Gabriel Staples", clicked the first hit, and scrolled down to my answer. Here's what works perfectly for me. This is the basic command format:

Good command--use this instead:

# GOOD cmd
pdftoppm -jpeg -r 300 input.pdf output

Note: on Linux Ubuntu, you may need to do sudo apt update && sudo apt install poppler-utils in order to install pdftoppm. Thanks, @Reynadan.

The -jpeg option sets the output image format to JPG, -r 300 sets the output resolution to 300 DPI, and the word output is the prefix for the page images, which are numbered and placed in your current working directory. A better way, in my opinion, is to first run mkdir -p images to create an "images" directory, then set the output to images/pg so that all output images land cleanly in the images dir you just created, with the file prefix pg in front of each page number.

Therefore, here are my favorite commands:

  1. [Produces ~1MB-sized files per pg] Output in .jpg format at 300 DPI:

    mkdir -p images && pdftoppm -jpeg -r 300 mypdf.pdf images/pg
    
  2. [Produces ~2MB-sized files per pg] Output in .jpg format at highest quality (least compression) and still at 300 DPI:

    mkdir -p images && pdftoppm -jpeg -jpegopt quality=100 -r 300 mypdf.pdf images/pg
    
  3. If you need more resolution, you can try 600 DPI:

    mkdir -p images && pdftoppm -jpeg -r 600 mypdf.pdf images/pg
    
  4. ...or 1200 DPI:

    mkdir -p images && pdftoppm -jpeg -r 1200 mypdf.pdf images/pg
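
The four variants above differ only in DPI and JPEG quality, so they can be folded into one small function. This is a sketch, not part of the original answer: the name pdf_to_jpgs and its argument order are invented here, and it only uses the pdftoppm flags already shown (-jpeg, -r, -jpegopt quality=N).

```shell
# Hypothetical wrapper around the pdftoppm commands listed above.
# Usage: pdf_to_jpgs input.pdf [dpi] [quality] [outdir]
#   dpi     defaults to 300
#   quality is passed as -jpegopt quality=N only when given
#   outdir  defaults to "images"; pages are written as outdir/pg-<number>.jpg
pdf_to_jpgs() {
    pdf=$1
    dpi=${2:-300}
    quality=${3:-}
    outdir=${4:-images}
    if [ ! -f "$pdf" ]; then
        echo "pdf_to_jpgs: no such file: $pdf" >&2
        return 1
    fi
    if ! command -v pdftoppm >/dev/null 2>&1; then
        echo "pdf_to_jpgs: pdftoppm not found (it is in poppler-utils)" >&2
        return 127
    fi
    mkdir -p "$outdir" || return 1
    if [ -n "$quality" ]; then
        pdftoppm -jpeg -jpegopt "quality=$quality" -r "$dpi" "$pdf" "$outdir/pg"
    else
        pdftoppm -jpeg -r "$dpi" "$pdf" "$outdir/pg"
    fi
}
```

For example, pdf_to_jpgs mypdf.pdf 600 runs the 600 DPI variant, and pdf_to_jpgs mypdf.pdf 300 100 runs the highest-quality one.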
    

See the references below for more details and options.

References:

  1. [my answer] Convert PDF to image with high resolution
  2. [my answer] https://askubuntu.com/questions/150100/extracting-embedded-images-from-a-pdf/1187844#1187844

Keywords: ubuntu linux convert pdf to images; pdf to jpeg; pdf to tiff; pdf2images; pdf2tiff; pdftoppm; pdftoimages; pdftotiff; pdftopng; pdf2png

Gulick answered 9/5, 2020 at 16:59 Comment(12)
For creating a single file do: pdftoppm -singlefile -jpeg -r 300 input.pdf output. I also share the frustration btw :). Maybe you can edit your answer to include this short version as well. Creating the dir might put people off.Marleen
@SteveChavez, I just updated my answer to show a shorter version too, and to explain what I'm doing with the mkdir -p images part of the command. Thanks.Gulick
@SteveChavez -singlefile, per the documentation, will "write only the first page and do not add digits". That means if your intent is to generate one long image of all the pages stacked vertically (or stitched horizontally), this will not work.Karlmarxstadt
@SteveChavez, I just confirmed what @Karlmarxstadt said. Using -singlefile caused pdftoppm to only convert the first page of a 46 pg PDF I just tested it on.Gulick
imagemagick did not work for me on Ubuntu 20.04. this worked like a charm!Lumbago
Can you just please remove the command that doesn't work so that future people who read quickly will not needlessly try it first? :-)) it doesn't add any value does itCesura
@matanster, I'd like to keep the broken command, because I do think it adds value, so I added some bold headings to make it super clear which command is broken and which one works. The value I think keeping the broken command adds is: 1) it makes that error string googlable so when others see that command is broken and they search for solutions, they can find this answer and see the alternative command option, 2) it reminds me I've seen this error before (I reference this answer all the time myself too), 3) it gives someone a chance who sees this to offer a fix for the convert command.Gulick
@matanster, also, keep in mind I wrote my answer 3 years after the other answer, which uses that convert command I find to be broken, was accepted as correct. I felt I needed to provide justification for why I'd add a new answer to a 3-yr-old question which already has an accepted and highly-upvoted answer. Only in the last month or so has my answer finally surpassed the original answer in votes.Gulick
pdftoppm did the trick for me. convert worked too, but the resulting jpeg was extremely blurry. With pdftoppm the result was way better. So I'd recommend it even if you don't have the problem of convert not working.Feverfew
you need to do sudo apt install poppler-utils to use pdftoppm commandEdwardedwardian
Damn, what's wrong with convert nowadays?Erdmann
@RickyRobinson, see in my answer: "here is the fix for imagemagick so convert will work". It's been broken for years--at least 3 or 4.Gulick
D
21

libvips can convert PDF -> JPEG quickly. It comes with most Linux distributions, it's in Homebrew on macOS, and you can download a Windows binary from the libvips site.

This will render the PDF to a JPG at the default DPI (72):

vips copy somefile.pdf somefile.jpg

You can use the dpi option to set some other rendering resolution, eg.:

vips copy somefile.pdf[dpi=600] somefile.jpg

You can pick out pages like this:

vips copy somefile.pdf[dpi=600,page=12] somefile.jpg

Or render five pages starting from page three like this:

vips copy somefile.pdf[dpi=600,page=3,n=5] somefile.jpg

The docs for pdfload have all the options.
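
One practical detail: the square brackets in these arguments are glob characters in most shells, so quoting the argument is safer (zsh in particular may refuse an unquoted one). A small sketch that builds the quoted argument from parameters; the function name vips_pdf_to_jpg is hypothetical, and the fallback values for page and n are my assumption about what pdfload uses when they are omitted.

```shell
# Hypothetical wrapper around the "vips copy" commands shown above.
# Usage: vips_pdf_to_jpg input.pdf output.jpg [dpi] [page] [n]
#   dpi  defaults to 72 (the default mentioned above)
#   page and n default to 0 and 1 (assumed to match pdfload's own defaults)
vips_pdf_to_jpg() {
    pdf=$1
    out=$2
    dpi=${3:-72}
    page=${4:-0}
    n=${5:-1}
    # Quote the whole argument so the shell never treats [...] as a glob.
    vips copy "${pdf}[dpi=${dpi},page=${page},n=${n}]" "$out"
}
```

With this, vips_pdf_to_jpg somefile.pdf somefile.jpg 600 3 5 runs the same command as the last example above.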

With this benchmark image, I see:

$ /usr/bin/time -f %M:%e convert -density 300 r8.pdf[3] x.jpg
276220:2.17
$ /usr/bin/time -f %M:%e pdftoppm -jpeg -r 300 -f 3 -l 3 r8.pdf x.jpg
91160:1.24
$ /usr/bin/time -f %M:%e vips copy r8.pdf[page=3,dpi=300] x.jpg
149572:0.53

So libvips is about 4x faster than convert and needs about half the memory, on this test at least. Compared with pdftoppm, it is about 2.3x faster but uses about 1.6x the memory.
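
For anyone who wants to check the arithmetic, the ratios follow directly from the /usr/bin/time output above (%M is peak memory in KB, %e is elapsed seconds). The ratio helper is just an illustration:

```shell
# Hypothetical helper: print a / b rounded to two decimals.
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

ratio 2.17 0.53       # convert time / vips time     -> 4.09
ratio 1.24 0.53       # pdftoppm time / vips time    -> 2.34
ratio 149572 276220   # vips memory / convert memory -> 0.54
ratio 149572 91160    # vips memory / pdftoppm memory -> 1.64
```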

Drench answered 7/6, 2020 at 15:37 Comment(1)
Much faster than pdf2ppm, mupdf and ghostscript!Economizer
D
12

Convert from imagemagick seems to do a good job:

convert file.pdf test.jpg 

If this generates multiple files (one per page), use:

convert test-0.jpg -append test-1.jpg ... -append one.jpg

to generate a single file, where all pages are concatenated.
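
The two steps can probably be combined, since -append works on the whole image sequence that convert reads from the PDF. A sketch under some assumptions: the function name pdf_to_single_jpg and the 150 DPI default are arbitrary choices of mine, the command is still called convert (ImageMagick 7 prefers magick), and it remains subject to the same "not authorized" policy error discussed in the other answers.

```shell
# Hypothetical helper: render every page of a PDF and stack the pages
# vertically into one JPEG.
# Usage: pdf_to_single_jpg input.pdf output.jpg [density]
pdf_to_single_jpg() {
    pdf=$1
    out=$2
    density=${3:-150}   # rendering DPI; an arbitrary default for this sketch
    # -density must come before the input so it applies while reading the PDF;
    # -append then joins all pages top to bottom.
    convert -density "$density" "$pdf" -append "$out"
}
```

For example, pdf_to_single_jpg file.pdf one.jpg 300. For long documents the resulting image can get very tall, so a lower density may be needed.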

Deadeye answered 29/3, 2017 at 6:40 Comment(1)
convert keywords are specified with a single dash, i.e. -append test-1.jpg etc.Godber
