You can also merge multiple PDFs with Ghostscript. The big advantage of this route is that a solution is easily scriptable, and it does not require a real programming effort:
gswin32c.exe ^
-dBATCH -dNOPAUSE ^
-sDEVICE=pdfwrite ^
-sOutputFile=merged.pdf ^
[...more Ghostscript options as needed...] ^
input1.pdf input2.pdf input3.pdf [....]
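If you would rather drive this from a script than type it by hand, the same argument list can be assembled programmatically. The following is only a minimal Python sketch and not part of the original recipe: the function names `build_merge_command` and `merge_pdfs` are made up for illustration, and the default executable name `gs` is an assumption (on Windows you would pass `gswin32c.exe` instead).

```python
import subprocess


def build_merge_command(inputs, output="merged.pdf", gs="gs", extra_options=()):
    """Return the Ghostscript argument list that merges `inputs` into `output`.

    `gs` is the executable name (assumed: "gs" on Linux, "gswin32c.exe" on
    Windows); `extra_options` stands in for any further Ghostscript options.
    """
    if not inputs:
        raise ValueError("need at least one input PDF")
    return [
        gs,
        "-dBATCH", "-dNOPAUSE",
        "-sDEVICE=pdfwrite",
        f"-sOutputFile={output}",
        *extra_options,
        *inputs,
    ]


def merge_pdfs(inputs, output="merged.pdf", gs="gs", extra_options=()):
    """Run Ghostscript; raises CalledProcessError if it exits non-zero."""
    subprocess.run(build_merge_command(inputs, output, gs, extra_options), check=True)
```

Passing the arguments as a list (rather than one shell string) avoids quoting problems with file names that contain spaces.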
With Ghostscript you'll also be able to pass pdfmark statements, which can add a table of contents (ToC) as well as bookmarks for each source file going into the resulting PDF. For example:
gswin32c.exe ^
-dBATCH -dNOPAUSE ^
-sDEVICE=pdfwrite ^
-sOutputFile=merged.pdf ^
[...more Ghostscript options as needed...] ^
file-with-pdfmarks-to-generate-a-ToC.ps ^
-f input1.pdf input2.pdf input3.pdf [....]
or
gswin32c.exe ^
-dBATCH -dNOPAUSE ^
-sDEVICE=pdfwrite ^
-sOutputFile=merged.pdf ^
[...more Ghostscript options as needed...] ^
file-with-pdfmarks-to-generate-a-ToC.ps ^
-f input1.pdf ^
input2.pdf ^
input3.pdf [....]
For some introduction to the pdfmark topic, see also Thomas Merz's PDFmark Primer.
Edit:
I had wanted to give you an example for file-with-pdfmarks-to-generate-a-ToC.ps, but somehow forgot it. Here it is:
[/Page 1 /View [/XYZ null null null] /Title (File 1) /OUT pdfmark
[/Page 2 /View [/XYZ null null null] /Title (File 2) /OUT pdfmark
[/Page 3 /View [/XYZ null null null] /Title (File 3) /OUT pdfmark
[/Page 4 /View [/XYZ null null null] /Title (File 4) /OUT pdfmark
This would create a ToC entry for each of the first 4 files, which here are also the first 4 pages (since you guarantee that each of your input files is exactly 1 page long).
- The [/XYZ null null null] part makes sure that the page viewport and zoom level do not change from the current ones when you follow the link. (By contrast, [/XYZ 222 111 2], to pick an arbitrary example, would jump to left=222, top=111 at a zoom factor of 2.)
- The /Title (some string you want) part determines what text is shown in the ToC.
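If your input files are not all exactly one page long, each /Page value has to be the first page that file occupies in the merged output. Below is a small Python sketch of how such a pdfmark file could be generated. It is an illustration only: the function names are hypothetical, and the page counts are assumed to be known already (the sketch does not read them from the PDFs). Titles are escaped for PostScript string syntax; non-ASCII titles would need extra encoding that is not handled here.

```python
def ps_string(text):
    """Escape a title for use inside a PostScript (...) string literal."""
    return text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")


def toc_pdfmarks(entries):
    """Build one /OUT pdfmark line per input file.

    entries: list of (title, page_count) tuples in merge order.
    Each bookmark points at the first page its file occupies in the
    merged PDF, computed as a running total of the page counts.
    """
    lines = []
    first_page = 1
    for title, page_count in entries:
        lines.append(
            f"[/Page {first_page} /View [/XYZ null null null] "
            f"/Title ({ps_string(title)}) /OUT pdfmark"
        )
        first_page += page_count
    return lines


if __name__ == "__main__":
    # Hypothetical input: three files with 1, 3 and 2 pages.
    marks = toc_pdfmarks([("File 1", 1), ("File 2", 3), ("File 3", 2)])
    print("\n".join(marks))
```

The printed lines can be saved as file-with-pdfmarks-to-generate-a-ToC.ps and passed to Ghostscript as in the commands above.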
You could even pass these pdfmark statements directly on the Ghostscript command line:
gswin32c.exe ^
-sDEVICE=pdfwrite ^
-o merged.pdf ^
[...more Ghostscript options as needed...] ^
-c "[/Page 1 /View [/XYZ null null null] /Title (File 1) /OUT pdfmark" ^
-c "[/Page 2 /View [/XYZ null null null] /Title (File 2) /OUT pdfmark" ^
-c "[/Page 3 /View [/XYZ null null null] /Title (File 3) /OUT pdfmark" ^
-c "[/Page 4 /View [/XYZ null null null] /Title (File 4) /OUT pdfmark" ^
-f input1.pdf ^
input2.pdf ^
input3.pdf ^
input4.pdf [....]
'nother Edit:
Oh, and by the way: Ghostscript does preserve the bookmarks when you use it to merge two PDF files into one -- pdftk.exe does not. Let's try it with the merged.pdf generated by the command from my first edit, effectively concatenating 2 copies of the same file:
gswin32c ^
-sDEVICE=pdfwrite ^
-o doublemerged.pdf ^
merged.pdf ^
merged.pdf
The file doublemerged.pdf will now have 2*4 = 8 bookmarks.
- As expected, bookmarks 1, 2, 3 and 4 link to pages 1, 2, 3 and 4.
- The problem is that bookmarks 5, 6, 7 and 8 also link to pages 1, 2, 3 and 4.
The reason is that the pre-existing bookmarks address their link targets by absolute page number. To work around that (and get bookmarks that work in merged files), one would have to generate bookmarks that point to their link targets via named destinations (and make sure these names are unique across all documents being merged).
(This approach also works on Linux, just use gs instead of gswin32c.)
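To make the named-destination workaround mentioned above more concrete, here is a hedged Python sketch that emits a /DEST pdfmark (defining a named destination) plus an /OUT pdfmark (a bookmark referring to that name) for each page. The function name, the `prefix` parameter and the `<prefix>_p<page>` naming scheme are assumptions chosen for illustration, and it assumes one page per title as in the example above. Whether your Ghostscript version keeps such destinations intact through a second merge is something you should verify yourself.

```python
import re


def named_dest_pdfmarks(prefix, titles):
    """Emit /DEST and /OUT pdfmark pairs using named destinations.

    prefix: a per-document tag that keeps destination names unique across
            all documents that will later be merged (hypothetical scheme).
    titles: bookmark titles, one per page, in page order.
    """
    if not re.fullmatch(r"[A-Za-z0-9_]+", prefix):
        raise ValueError("prefix must contain only letters, digits or underscores")
    lines = []
    for page, title in enumerate(titles, start=1):
        name = f"{prefix}_p{page}"
        # Escape characters that are special inside a PostScript string.
        safe_title = (
            title.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
        )
        # Define the named destination on this page ...
        lines.append(
            f"[/Dest /{name} /Page {page} /View [/XYZ null null null] /DEST pdfmark"
        )
        # ... and a bookmark that refers to it by name, not by page number.
        lines.append(f"[/Dest /{name} /Title ({safe_title}) /OUT pdfmark")
    return lines
```

Using a different `prefix` for each source document (for example `doc1`, `doc2`) is what keeps the names from colliding once the documents are merged.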
Appendix
The command lines above use [...more Ghostscript options as needed...] as a placeholder for more options.
If you do not use other options, Ghostscript will apply its built-in defaults for various parameters. However, this may give you results which are not to your liking. Since Ghostscript generates a completely new PDF based on the input, some of the original objects may be changed. This is true for color spaces and for image compression levels.
How to apply parameters which leave the originally embedded images unchanged can be seen over at SuperUser: "Use Ghostscript, but tell it to not reprocess images".