Best way to convert pdf files to tiff files [closed]
C

13

80

I have around 1000 pdf files and I need to convert them to 300 dpi tiff files. What is the best way to do this? An SDK, or a tool that can be scripted, would be ideal.

Checkerberry answered 16/9, 2008 at 18:32 Comment(6)
This is the solution that I use: Pdf to Tiff using Xpdf's pdftoppm and LibTIFF's ppm2tiff and tiffcp (optional, only if multipage): stackoverflow.com/a/12868254/551460 Irrelative
any final solution with full source code sample ? maybe using powershell script..Rasping
@Rasping I posted one solution using powershell. See it below...Checkerberry
Use Ghostscript as gs -q -dNOPAUSE -r300x300 -sDEVICE=tiff24nc -sOutputFile=output.tif input.pdf -c quit (on Windows the command is gswin32c) to produce a 300x300 dpi, 24-bit color imageTarim
Best way to convert PDF files to TIFF files? For sure: use pdftoppm, as follows: mkdir images && pdftoppm -tiff -r 300 mypdf.pdf images/pg. See here for details, usage, & more info: askubuntu.com/questions/150100/….Outrun
This is somewhat of a cop-out answer, but I tried Ghostscript and didn't have good success. On the other hand, Adobe Acrobat had an option to export the PDF as .tif files, and it worked perfectly for my needs. Its output settings can also be adjusted.Anaximander
C
74

Use Imagemagick, or better yet, Ghostscript.

http://www.ibm.com/developerworks/library/l-graf2/#N101C2 has an example for imagemagick:

convert foo.pdf pages-%03d.tiff

http://www.asmail.be/msg0055376363.html has an example for ghostscript:

gs -q -dNOPAUSE -sDEVICE=tiffg4 -sOutputFile=a.tif foo.pdf -c quit

I would install ghostscript and read the man page for gs to see what exact options are needed and experiment.
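For scripting, the Ghostscript call above can be wrapped in a few lines of Python. This is only a minimal sketch: the names `DEVICES`, `gs_command` and `convert` are made up for illustration, it assumes the `gs` executable is on the PATH (pass `gs="gswin32c"` on Windows), and it adds `-r300`, `-dBATCH` and a choice of color device, none of which are in the one-liner above.

```python
import subprocess

# Ghostscript TIFF output devices, keyed by a hypothetical "mode" name.
DEVICES = {
    "bw": "tiffg4",       # 1-bit, Group 4 fax compression
    "gray": "tiffgray",   # 8-bit grayscale
    "color": "tiff24nc",  # 24-bit RGB
}

def gs_command(pdf, tiff, dpi=300, mode="color", gs="gs"):
    """Build the argument list for one PDF -> TIFF conversion."""
    return [
        gs, "-q", "-dNOPAUSE", "-dBATCH",
        f"-r{dpi}",                   # resolution in dots per inch
        f"-sDEVICE={DEVICES[mode]}",  # output format / color depth
        f"-sOutputFile={tiff}",
        pdf,
    ]

def convert(pdf, tiff, **options):
    """Run Ghostscript; raises CalledProcessError on a non-zero exit code."""
    subprocess.run(gs_command(pdf, tiff, **options), check=True)
```

Calling `convert("foo.pdf", "a.tif")` would then run Ghostscript at 300 dpi in color, and `convert("foo.pdf", "a.tif", mode="bw")` would reproduce the tiffg4 output of the command above.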

Campy answered 16/9, 2008 at 18:39 Comment(7)
ghostscript works really well. As far as I understand, imagemagick reuses ghostscript for pdf operations. Is this correct?Checkerberry
that's what I hear, but I'm not an expert on ImageMagick internals ;)Campy
does imagemagick handle multipage pdf --> tiff properly?Kristoforo
wow, ghostscript really needs to clean up its command line interface!Carder
imagemagick worked well without configuration. I could not configure ghostscript properly to get a high resolution colour image.Reinwald
convert foo.pdf pages-%03d.tiff produces horribly-low-quality images. How do we increase the resolution to be what is already in the pdf, so no resolution is lost?Outrun
After tons and tons and tons of research, here's what I've decided is best: pdftoppm: askubuntu.com/questions/150100/…. Also, in case the goal of making TIFFs here is to use tesseract to convert the PDF to a searchable pdf via OCR, I've done that now too and written an interface to do it in one step: pdf2searchablepdf input.pdf. See here: askubuntu.com/questions/473843/….Outrun
B
46

Using GhostScript from the command line, I've used the following in the past:

on Windows:

gswin32c -dNOPAUSE -q -g300x300 -sDEVICE=tiffg4 -dBATCH -sOutputFile=output_file_name.tif input_file_name.pdf

on *nix:

gs -dNOPAUSE -q -g300x300 -sDEVICE=tiffg4 -dBATCH -sOutputFile=output_file_name.tif input_file_name.pdf

For a large number of files, a simple batch/shell script could be used to convert an arbitrary number of files...

Bartender answered 22/9, 2008 at 5:14 Comment(4)
+1. Useful command. But my color figure is outputting in black and white. Any idea why?Sibella
-sDEVICE=tiffg4 is a black and white fax compression model. See: pages.cs.wisc.edu/~ghost/doc/AFPL/8.00/Devices.htm#TIFFHaematoma
Most of the time you want to convert a pdf to TIFF images of 300x300 dpi, not 300x300 size. For this reason, replace -g switch with -r: gswin32c -dNOPAUSE -q -r300x300 ...Stoller
Thanks @HairyFotr. For anyone else visiting, you should be using -sDEVICE=tiff24nc for 24-bit RGB, or -sDEVICE=tiff12nc for 12-bit (8/4 bits per channel, respectively).Talca
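The difference between -g and -r comes down to simple arithmetic. PDF page sizes are measured in points (72 per inch), so the pixel size at a given resolution is points / 72 * dpi. A small sketch, with a function name made up for illustration:

```python
def pixels_at_dpi(width_pt, height_pt, dpi):
    """Pixel dimensions of a page whose size is given in PDF points (1/72 inch)."""
    return round(width_pt * dpi / 72), round(height_pt * dpi / 72)

# US Letter is 612 x 792 points, i.e. 8.5 x 11 inches.
print(pixels_at_dpi(612, 792, 300))  # (2550, 3300)
```

So -r300 renders a Letter page at 2550x3300 pixels, whereas -g300x300 asks for an output that is only 300x300 pixels.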
C
19

I wrote a little powershell script to go through a directory structure and convert all pdf files to tiff files using ghostscript. Here is my script:

$tool = 'C:\Program Files\gs\gs8.63\bin\gswin32c.exe'
$pdfs = get-childitem . -recurse | where {$_.Extension -match "pdf"}

foreach($pdf in $pdfs)
{

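    # Swap the extension instead of splitting on '.', so paths that contain other dots still work
    $tiff = [System.IO.Path]::ChangeExtension($pdf.FullName, '.tiff')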
    if(test-path $tiff)
    {
        "tiff file already exists " + $tiff
    }
    else        
    {   
        'Processing ' + $pdf.Name        
        $param = "-sOutputFile=$tiff"
        & $tool -q -dNOPAUSE -sDEVICE=tiffg4 $param -r300 $pdf.FullName -c quit
    }
}
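For anyone who prefers Python, the same directory walk might look roughly like this. It is a sketch only: `pending_conversions` is a made-up name, and it covers just the "find PDFs that have no TIFF yet" part of the script, not the Ghostscript call itself. Note that `rglob("*.pdf")` is case-sensitive on most non-Windows systems.

```python
from pathlib import Path

def pending_conversions(root):
    """Yield (pdf, tiff) path pairs for PDFs under root that have no TIFF yet."""
    for pdf in sorted(Path(root).rglob("*.pdf")):
        tiff = pdf.with_suffix(".tiff")  # same folder and name, new extension
        if not tiff.exists():
            yield pdf, tiff
```

Each pair can then be handed to Ghostscript, for example with `subprocess.run`.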
Checkerberry answered 23/9, 2008 at 10:55 Comment(1)
After 7 years, this continues to be helpful! I would only add that if you have no PowerShell experience, you need to: 1. Edit the $tool value to match the path and version on your system. 2. Open PowerShell and cd to the directory where the PDFs are stored. 3. Paste the code into the PowerShell window. I needed to press Enter a couple of times afterwards to get it to run. Thanks gyuriscHoarfrost
G
9

1) Install GhostScript

2) Install ImageMagick

3) Create "Convert-to-TIFF.bat" (Windows XP, Vista, 7) and use the following line:

for %%f in (%*) DO "C:\Program Files\ImageMagick-6.6.4-Q16\convert.exe" -density 300 -compress lzw %%f %%f.tiff

Dragging any number of single-page PDF files onto this file will convert them to compressed TIFFs, at 300 DPI.

Grog answered 24/9, 2010 at 19:1 Comment(3)
Is GhostScript required? What if I only install ImageMagick?Rasping
This worked perfectly. Thanks a lot.Viborg
How can we change color to greyscale or any other similar colors? Also it repeats the file name while saving. I'm using it on Windows 10Cobaltic
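On the greyscale and file name questions, here is a rough Python sketch of the same ImageMagick call. The function name is made up, and it assumes an ImageMagick `convert` executable is reachable (on Windows you would likely pass the full path, as the batch file does, since Windows ships its own unrelated convert.exe). `-colorspace Gray` is the ImageMagick option for greyscale output, and `with_suffix` replaces ".pdf" with ".tiff" instead of appending to it.

```python
from pathlib import Path

def convert_command(pdf, dpi=300, grayscale=False, convert="convert"):
    """Build an ImageMagick command line for one PDF."""
    pdf = Path(pdf)
    # -density has to come before the input file so the PDF is rasterized at that dpi
    cmd = [convert, "-density", str(dpi), str(pdf)]
    if grayscale:
        cmd += ["-colorspace", "Gray"]
    # foo.pdf -> foo.tiff (not foo.pdf.tiff)
    cmd += ["-compress", "lzw", str(pdf.with_suffix(".tiff"))]
    return cmd
```

The resulting list can be passed straight to `subprocess.run(cmd, check=True)`.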
S
7

Using Python, this is what I ended up with:

import os
os.popen(' '.join([
                   self._ghostscriptPath + 'gswin32c.exe', 
                   '-q',
                   '-dNOPAUSE',
                   '-dBATCH',
                   '-r300',
                   '-sDEVICE=tiff12nc',
                   '-sPAPERSIZE=a4',
                   '-sOutputFile=%s %s' % (tifDest, pdfSource),
                   ]))
Svetlana answered 21/10, 2008 at 9:56 Comment(1)
Generally you'll want to use subprocess for this. os.popen is considered deprecated. The syntax is nearly the same.Shroff
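A sketch of what the subprocess version could look like, under the assumption that the Ghostscript options stay the same as in the answer. `run_tool` and `gs_args` are illustrative names, not part of any library. Passing the arguments as a list avoids the shell, so paths with spaces need no extra quoting.

```python
import subprocess

def run_tool(args):
    """Run an external program without a shell and capture its output.

    Raises CalledProcessError if the program exits with a non-zero code.
    """
    return subprocess.run(args, capture_output=True, text=True, check=True)

def gs_args(gs_exe, pdf_source, tif_dest):
    """Same options as the os.popen call, as a list instead of one string."""
    return [
        gs_exe, "-q", "-dNOPAUSE", "-dBATCH",
        "-r300", "-sDEVICE=tiff12nc", "-sPAPERSIZE=a4",
        "-sOutputFile=" + tif_dest,
        pdf_source,
    ]
```

Usage would be along the lines of `run_tool(gs_args("gswin32c.exe", "in.pdf", "out.tif"))`.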
D
3

How about pdf2tiff? http://python.net/~gherman/pdf2tiff.html

Diandrous answered 16/9, 2008 at 18:35 Comment(1)
this does not handle multipage tiffs yet, so unfortunately this is no go for me. Thanks for the suggestion though.Checkerberry
S
3

ABCPDF can do so as well -- check out http://www.websupergoo.com/helppdf6net/default.html

Sottish answered 16/9, 2008 at 18:42 Comment(0)
H
3

PDF Focus .Net can do it this way:

1. PDF to TIFF

SautinSoft.PdfFocus f = new SautinSoft.PdfFocus();    

string pdfPath = @"c:\My.pdf";

string imageFolder = @"c:\images\";

f.OpenPdf(pdfPath);

if (f.PageCount > 0)
{
    //Save all PDF pages to image folder as tiff images, 200 dpi
    int result = f.ToImage(imageFolder, "page",System.Drawing.Imaging.ImageFormat.Tiff, 200);
}

2. PDF to Multipage-TIFF

//Convert PDF file to Multipage TIFF file

SautinSoft.PdfFocus f = new SautinSoft.PdfFocus();

string pdfPath = @"c:\Document.pdf";
string tiffPath = @"c:\Result.tiff";

f.OpenPdf(pdfPath);

if (f.PageCount > 0)
{
    if (f.ToMultipageTiff(tiffPath, 120) == 0)
    {
        System.Diagnostics.Process.Start(tiffPath);
    }
}   
Hourglass answered 2/12, 2011 at 8:12 Comment(0)
P
2

https://pypi.org/project/pdf2tiff/

You could also use pdf2ps, ps2image and then convert from the resulting image to tiff with other utilities (I remember 'paul', yet another image viewer that displays PNG, TIFF, GIF, JPG, etc.)

Plio answered 16/9, 2008 at 18:42 Comment(0)
C
2

Disclaimer: I work for the company whose product I am recommending.

Atalasoft has a .NET library that can convert PDF to TIFF -- we are a partner of FOXIT, so the PDF rendering is very good.

Churl answered 19/9, 2008 at 0:3 Comment(0)
F
2

Requires ghostscript and tiffcp. Tested on Ubuntu.

import glob
import os
import subprocess

def pdf2tiff(source, destination):
    # Drop the extension; pages are first written as <base>__001.tiff, <base>__002.tiff, ...
    base = os.path.splitext(destination)[0]
    # Render each page to its own Group 4 (black-and-white) TIFF at 600 dpi.
    subprocess.run([
        'gs', '-q', '-dNOPAUSE', '-dBATCH',
        '-sDEVICE=tiffg4',
        '-r600', '-sPAPERSIZE=a4',
        '-sOutputFile=' + base + '__%03d.tiff',
        source,
    ], check=True)
    # Merge the page files into one multipage TIFF, then remove them.
    pages = sorted(glob.glob(glob.escape(base) + '__*.tiff'))
    subprocess.run(['tiffcp'] + pages + [base + '.tiff'], check=True)
    for page in pages:
        os.remove(page)

pdf2tiff('abc.pdf', 'abc.tiff')
Fury answered 22/9, 2011 at 7:57 Comment(0)
S
1

I like PDFTIFF.com to convert PDF to TIFF, it can handle unlimited pages

Stackhouse answered 11/3, 2010 at 20:34 Comment(1)
This site no longer exists.Nickelson
A
1

Maybe also try this? PDF Focus

This .Net library allows you to solve the problem :)

This code will help (Convert 1000 PDF files to 300-dpi TIFF files in C#):

    SautinSoft.PdfFocus f = new SautinSoft.PdfFocus();

    string[] pdfFiles = Directory.GetFiles(@"d:\Folder with 1000 pdfs\", "*.pdf");
    string folderWithTiffs = @"d:\Folder with TIFFs\";

    foreach (string pdffile in pdfFiles)
    {
        f.OpenPdf(pdffile);

        if (f.PageCount > 0)
        {
            //save all pages to tiff files with 300 dpi
            f.ToImage(folderWithTiffs, Path.GetFileNameWithoutExtension(pdffile), System.Drawing.Imaging.ImageFormat.Tiff, 300);
        }
        f.ClosePdf();
    }
Abb answered 9/11, 2011 at 12:52 Comment(1)
You should specify "pdf focus" comes with a price.Nickelson
