You could try to 'repair' your pdftk-concatenated PDF using Ghostscript (but use a recent version, such as 9.05). In many cases Ghostscript will be able to merge the many subsetted fonts into fewer ones.
The command would look like this:
gswin32c.exe ^
-o output.pdf ^
-sDEVICE=pdfwrite ^
-dPDFSETTINGS=/prepress ^
input.pdf
To check how many instances of the various font subsets each file contains, run:
pdffonts.exe output.pdf
pdffonts.exe input.pdf
(pdffonts.exe is available as part of a small package of command-line tools.)
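If you want to compare the two listings without counting rows by hand, a small script can tally the subsets for you. This is only a sketch: it assumes that pdffonts prints the font name in the first column and that subsetted fonts carry the usual six-uppercase-letter tag followed by "+" (e.g. ABCDEF+Arial). The function names count_font_subsets and pdffonts_report are made up for this example.

```python
import re
import subprocess
from collections import Counter

# A subset font name looks like "ABCDEF+Arial": six uppercase letters, "+", base name.
SUBSET_RE = re.compile(r"^[A-Z]{6}\+(\S+)")

def count_font_subsets(pdffonts_output):
    """Count how many subset instances of each base font appear in pdffonts output."""
    counts = Counter()
    for line in pdffonts_output.splitlines():
        match = SUBSET_RE.match(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts

def pdffonts_report(pdf_path, exe="pdffonts"):
    """Run pdffonts (assumed to be on PATH) and tally its subset fonts."""
    result = subprocess.run([exe, pdf_path], capture_output=True, text=True, check=True)
    return count_font_subsets(result.stdout)
```

Calling pdffonts_report("input.pdf") and pdffonts_report("output.pdf") and comparing the two tallies should show whether Ghostscript really reduced the number of subsets per font.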
But don't complain about the 'slow speed' of this process: Ghostscript has to fully interpret all PDF input files to accomplish its task, whereas the pdftk file concatenation is a much simpler process...
Update:
Instead of pdftk you could use Ghostscript to merge your input PDF files. This could possibly avoid the problem you were seeing with the a posteriori Ghostscript 'repair' of your pdftk-merged files. Note that this will be much slower than the 'dumb' pdftk merge. However, the results may please you more, especially regarding font handling and file size.
This would be a possible command:
gswin32c.exe ^
-o output.pdf ^
-sDEVICE=pdfwrite ^
-dPDFSETTINGS=/prepress ^
input1.pdf input2.pdf input3.pdf
You can add more options to the Ghostscript command line for more fine-grained control over the merge and optimization process.
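If you merge many batches, it may be easier to assemble the Ghostscript call from a script than to type it each time. The following is a minimal sketch that uses only the switches shown in this answer; the helper names build_gs_merge_command and run_gs_merge, and the extra_options parameter, are assumptions for illustration.

```python
import subprocess

def build_gs_merge_command(inputs, output, gs_exe="gswin32c.exe", extra_options=()):
    """Return the argument list for a Ghostscript pdfwrite merge of `inputs`."""
    if not inputs:
        raise ValueError("at least one input PDF is required")
    return [
        gs_exe,
        "-o", output,               # output file
        "-sDEVICE=pdfwrite",        # write PDF
        "-dPDFSETTINGS=/prepress",  # note: -d, not -s
        *extra_options,             # any additional switches you want to try
        *inputs,                    # input PDFs, merged in this order
    ]

def run_gs_merge(inputs, output, **kwargs):
    """Build the command and run it; raises CalledProcessError if Ghostscript fails."""
    subprocess.run(build_gs_merge_command(inputs, output, **kwargs), check=True)
```

For example, run_gs_merge(["input1.pdf", "input2.pdf"], "output.pdf") would run the same command as above with two inputs.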
In the end you'll have to decide between the extremes:
- 'Fast' pdftk producing large output files, vs.
- 'Slow' gswin32c.exe (Ghostscript) producing lean output files.
I'd be interested if you would post some results (execution time and resulting file sizes) for both methods for a number of your merge processes...
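One possible way to collect such numbers is a small timing harness like the one below. It is a sketch, not a benchmark suite: measure and compare are hypothetical helper names, and the commands in the trailing comment assume that pdftk and gswin32c.exe are on your PATH and that the file names are placeholders for your own.

```python
import os
import subprocess
import time

def measure(command, output_path):
    """Run `command` and return (elapsed_seconds, output_size_in_bytes)."""
    start = time.perf_counter()
    subprocess.run(command, check=True)
    elapsed = time.perf_counter() - start
    return elapsed, os.path.getsize(output_path)

def compare(candidates):
    """`candidates` maps a label to (command, output_path); returns label -> (seconds, bytes)."""
    return {label: measure(cmd, out) for label, (cmd, out) in candidates.items()}

# Example (placeholder file names):
# results = compare({
#     "pdftk": (["pdftk", "input1.pdf", "input2.pdf", "cat", "output", "pdftk-out.pdf"],
#               "pdftk-out.pdf"),
#     "ghostscript": (["gswin32c.exe", "-o", "gs-out.pdf", "-sDEVICE=pdfwrite",
#                      "-dPDFSETTINGS=/prepress", "input1.pdf", "input2.pdf"],
#                     "gs-out.pdf"),
# })
# for label, (seconds, size) in results.items():
#     print(f"{label}: {seconds:.1f} s, {size} bytes")
```

A single run per method is noisy, so repeating each merge a few times on the same inputs would give more trustworthy figures.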
Update 2: Sorry, my previous version contained a typo. It's not -sPDFSETTINGS=... but it must be -dPDFSETTINGS=... (d in place of s).
Update 3:
Since your source files are Excel sheets made from templates (which usually don't use a lot of different fonts), you could try to use a trick to make sure Ghostscript has all the required glyphs of the fonts used in all to-be-merged-later PDFs:
- For each font and face (standard, italic, bold, bold-italic) add a table cell into your template sheet at the top left of your print area.
- Fill this table cell with all printable characters and punctuation signs from the ASCII alphabet: 0123456789, ABCD...XYZ, abc...xyz, :-_;°%&$§")({}[] etc.
- Make the cell (and the fontsize) as small as you want or need in order to not disturb your overall layout. Use the color white to format the characters in the cell (so they appear invisible in the final PDF).
This method will hopefully make sure that each of your PDFs uses the same subset of glyphs, which would then avoid the problems you observed when merging the files with Ghostscript. (Note that if you use, for example, Arial and Arial-Italic, you have to create two such cells: one formatted with the standard Arial typeface, the other with the italic one.)
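Rather than typing the cell content by hand, you could generate it. The snippet below is a sketch that builds one string from Python's standard character classes; the function name build_glyph_cell_text is made up, and the extra characters "°§" are simply the non-ASCII signs from the list above, so adjust them to whatever your sheets actually use.

```python
import string

def build_glyph_cell_text(extra="°§"):
    """Return digits, A-Z, a-z, ASCII punctuation and any extra signs as one string."""
    return (
        string.digits
        + string.ascii_uppercase
        + string.ascii_lowercase
        + string.punctuation
        + extra
    )

if __name__ == "__main__":
    # Paste the printed line into the hidden template cell, once per font face.
    print(build_glyph_cell_text())
```

The same text goes into every cell; only the cell's font and face differ.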
pdffonts input.pdf for a few of your input files, as well as pdffonts output.pdf for the file which pdftk created from the same set of inputs? – Hotchpot