pdftk split pdf with multiple pages but total size grew

Using PHP, I have to split a single multi-page PDF into many PDF files with one page each. I use pdftk and it works fine, but each single-page PDF it creates is very large. My original PDF is 7 MB (70 pages), yet the files produced by splitting it with pdftk add up to over 70 MB.

Does anyone know if pdftk has an option to produce smaller output files?

Policeman answered 15/11, 2013 at 0:17 Comment(1)
A very good solution is to use Cloudinary to split the PDF and retrieve page previews as images: cloudinary.com - Policeman

You could always specify the compress option - for example:

pdftk input.pdf burst output output_%02d.pdf compress

Note that pdftk just copies the content of your PDF files from the inputs into the outputs, and can't do very much to optimize away bloat. So if your input PDFs are large/complicated, your output PDFs will be also. Also note that any fonts embedded in the document may end up being duplicated in each output document, taking up more space.
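To see how much the burst actually inflates the total, the combined size of the per-page files can be compared with the original. The following is a minimal sketch, assuming a POSIX shell with `wc` available; `total_bytes` is a hypothetical helper name, and `output_*.pdf` simply matches the file names from the command above.

```shell
#!/bin/sh
# total_bytes: hypothetical helper that prints the combined size in bytes
# of all regular files passed as arguments.
total_bytes() {
    total=0
    for f in "$@"; do
        [ -f "$f" ] || continue      # skip unmatched globs and missing files
        size=$(wc -c < "$f")         # byte count of this file
        total=$((total + size))
    done
    echo "$total"
}

# Usage (assumes pdftk is installed and input.pdf exists):
#   pdftk input.pdf burst output output_%02d.pdf compress
#   echo "original: $(total_bytes input.pdf) bytes"
#   echo "pages:    $(total_bytes output_*.pdf) bytes"
```

If the second number is many times the first, shared resources such as fonts are probably being copied into every page file.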

Shiekh answered 15/11, 2013 at 0:23 Comment(2)
@Simone, as pobrelkey said, the original file may have had shared resources, such as fonts or a background image, which are now duplicated 70 times. A single-page sample might help someone suggest ways to optimize. Also note that pdftk doesn't compress as far as possible (it doesn't use PDF 1.5 features such as compressed xref tables and object streams), though those wouldn't give 70-fold compression, of course. - Lotz
OK, I tried this and it solved my problem: pandemoniumillusion.wordpress.com/2008/05/07/… - Policeman

You can use pdftk to split the file into larger page ranges instead of single pages, for example:

pdftk source.pdf cat 1-100 output try1.pdf
pdftk source.pdf cat 101-end output try2.pdf
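
For documents with more than two chunks, the page ranges can be generated rather than typed by hand. This is only a sketch, assuming a POSIX shell; `page_ranges` is a hypothetical helper, and the page count of 250 and chunk size of 100 in the usage comment are made-up example values.

```shell
#!/bin/sh
# page_ranges: hypothetical helper; prints one pdftk-style page range per
# line for a document of TOTAL pages split into chunks of CHUNK pages.
page_ranges() {
    total=$1
    chunk=$2
    start=1
    while [ "$start" -le "$total" ]; do
        end=$((start + chunk - 1))
        if [ "$end" -gt "$total" ]; then
            end=$total               # last chunk may be shorter
        fi
        echo "${start}-${end}"
        start=$((end + 1))
    done
}

# Usage (assumes pdftk is installed; 250 and 100 are example values):
#   n=1
#   for r in $(page_ranges 250 100); do
#       pdftk source.pdf cat "$r" output "try${n}.pdf"
#       n=$((n + 1))
#   done
```

Larger chunks mean that fonts and other shared resources are duplicated fewer times than with one file per page.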
Voltz answered 8/3, 2015 at 22:44 Comment(0)

When splitting PDF files, it's sometimes hard to avoid information which is only required by some pages being included in each output file.

cpdf tries hard to avoid this -- you can try it and see what happens. You might find it's no better than pdftk on your file, but it should be.

Disclosure: I am the author of cpdf.

Cayenne answered 15/11, 2013 at 13:51 Comment(0)

I had a similar problem. It doesn't map 1:1 onto the question, but somebody might find it useful:

  1. I had a very big PDF file, original.pdf, of more than 240MB. It was almost impossible to use. I printed it to a PDF with evince, with scaling disabled in the printer setup. This generated a file, new.pdf, of around 102MB! Apparently the embedded fonts, bookmarks and so on were removed.
  2. To get the bookmarks back, I used cpdf to extract the bookmarks from the original PDF document and apply them to the new one. The resulting document, result.pdf, is easy to navigate and very quick in any PDF viewer.

Reference: cpdf to extract and apply bookmarks: http://www.coherentpdf.com/cpdfmanual/node38.html

cpdf -list-bookmarks original.pdf > bookmarks.txt
cpdf -add-bookmarks bookmarks.txt new.pdf -o result.pdf
Richma answered 9/11, 2015 at 22:9 Comment(0)

I have the same problem, and I tested both programs mentioned in these answers, pdftk and cpdf.

My PDF file's size is 5744k.

Using the following pdftk command:

set pdftk="C:\Program Files (x86)\Tools\PDFtk\bin\pdftk.exe"
%pdftk% "RY18BPSA.UserManual.pdf" CAT 1 9-15 220 output "RY18BPSA.PDFTK.pdf"

I obtain a 501k file.

Using the following cpdf command:

set cpdf="C:\Program Files\Tools\cpdf\cpdf.exe"
%cpdf% "RY18BPSA.UserManual.pdf" 1,9-15,220 -o "RY18BPSA.CPDF.pdf"

I obtain a 592k file.


Just for fun, I also tried printing the desired pages directly to the Microsoft Print to PDF pseudo-printer and obtained a 250k file!

The only differences I can quickly see compared with the other generated files are that the page format has been changed to A4, and that this can only be done manually through the Print dialog of a PDF program such as Acrobat Reader or Foxit Reader.

PS: Text search works in all of the generated split files!

Deranged answered 22/5, 2021 at 6:30 Comment(0)

I had a similar problem and tried many different tools. I realized that, even when compressing the original file doesn't seem to shrink it, the output of the split (or burst) can be much smaller afterwards. The solution that worked best for me was the combination of these two steps:

  1. Compress your original file with pdf2go (basic compression worked for me). Printing it to a new file with evince, as suggested in another answer, also worked, though not as well in my case. The file size may not shrink at all (in my case it even increased), but the output files after burst are still much smaller.

  2. Use pdftk with compress option:

    pdftk input.pdf burst output output_%02d.pdf compress

Kurland answered 7/6, 2022 at 21:19 Comment(0)
