How to repair a PDF file and embed missing fonts

I use pdftk to repair some failures in corrupted PDF files, but I have run into another problem that pdftk does not fix (or at least I do not know how to make it do so).

I have PDF files with text set in TrueType fonts, but the fonts were not embedded when the PDFs were created. Now I want to embed the required fonts in the existing files.

Is there a command-line tool (like pdftk) that embeds missing fonts, given the path to the TTF files?

Silberman answered 12/10, 2012 at 11:14 Comment(0)

You can use Ghostscript to embed missing fonts. Run a command like this (FONTPATH takes a list of directories; the list separator is ":" on Unix-like systems and ";" on Windows):

gs                                             \
  -o file-with-embedded-fonts.pdf              \
  -sDEVICE=pdfwrite                            \
  -dEmbedAllFonts=true                         \
  -sFONTPATH="/path/to/ttf:/other/path/to/ttf" \
   input-without-embedded-fonts.pdf
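
To check whether the fonts actually ended up embedded, one option (not mentioned in this answer) is the pdffonts utility from Poppler, which prints one row per font with an "emb" column. The helper below is a minimal sketch that filters that listing for fonts reported as not embedded. The function name unembedded_fonts is hypothetical, and the sketch assumes the usual pdffonts layout: two header lines, then rows ending in the emb, sub, and uni columns followed by a two-field object ID.

```shell
# Read `pdffonts` output on stdin and print the names of fonts
# whose "emb" column says "no".
# Assumed row layout: name type [encoding] emb sub uni object-number generation
# The font type may contain spaces ("Type 1"), so columns are counted from the end.
unembedded_fonts() {
  awk 'NR > 2 && NF >= 7 && $(NF - 4) == "no" { print $1 }'
}

# Usage (requires pdffonts, e.g. from the poppler-utils package):
#   pdffonts file-with-embedded-fonts.pdf | unembedded_fonts
```

If the function prints nothing, every font in the listing was reported as embedded.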


Krall answered 29/10, 2012 at 23:23 Comment(7)
On win32, if you have installed Ghostscript, the command may look like: gswin32c -sFONTPATH=C:\Windows\Fonts -o output-pdf-with-embedded-fonts.pdf -sDEVICE=pdfwrite -dPDFSETTINGS=/prepress input-pdf-where-some-fonts-are-not-embedded.pdf (find the exe file on your system and, if necessary, add its directory to the PATH environment variable) - Bhayani
@Qtax: not needed I think -- this is supposed to be the default setting for Ghostscript anyway when embedding fonts. - Krall
@KurtPfeifle you are right! Removing that comment (and this one), and adding one to make people aware that font subsetting is done. - Pangaro
gs subsets fonts by default when embedding them (that is, it embeds only the glyphs that are used in the PDF). This can be disabled with -dSubsetFonts=false. - Pangaro
This breaks PDF files that have forms in them. Does anyone know of a gs flag that will preserve the forms? - Whoopee
@Fuhrmanator: FOSS software and PDF forms -- a long chapter in the book "List of important functionality missing or sucking in FOSS".... - Krall
Note that this will not embed the standard fonts. If you want to embed those as well, add -c "<</NeverEmbed []>> setdistillerparams" -f before the input file name. - Mutualism
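
Putting the comments on subsetting and on standard fonts together, a complete invocation might look like the sketch below. The wrapper name embed_all_fonts and the file names are hypothetical; the switches are the ones quoted in this thread. Whether -dSubsetFonts=false is desirable depends on how large an output file you can accept, so drop that line to keep the default subsetting.

```shell
# Embed every font, including the standard ones, without subsetting.
# $1 = input PDF, $2 = output PDF
embed_all_fonts() {
  gs -o "$2" \
     -sDEVICE=pdfwrite \
     -dEmbedAllFonts=true \
     -dSubsetFonts=false \
     -c "<</NeverEmbed [ ]>> setdistillerparams" \
     -f "$1"
}

# Usage:
#   embed_all_fonts input-without-embedded-fonts.pdf file-with-embedded-fonts.pdf
```

The -f switch ends the PostScript passed with -c, so the input file has to come after it.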

I just had the same problem (on Ubuntu 14.04) and I found the following solution:

  • install Acrobat Reader
  • open the PDF and use "Print to file" to write a PostScript file ("foo.ps"), with "Advanced -> Print as image" enabled
  • then, on the console, run ps2pdf foo.ps foo.pdf; the result is a file with embedded fonts and the original content

The intermediate PostScript file is much bigger (650 KB) than the input file (56 KB), but the resulting PDF is moderate in size again (82 KB).

I do not know why this works, i.e.,

  • why "print as image to file" seems to create an image but also preserves font information,
  • why ps2pdf recovers this font information, and
  • why there are fonts in the resulting PDF at all, since it should only be an image, right?

But the result is a PDF with all fonts embedded and a size similar to the original file.
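
The GUI steps can also be approximated entirely on the console. This is a sketch under the assumption that Ghostscript's pdf2ps script is an acceptable stand-in for the "print to file" step; it is not what was done above, and the function name roundtrip_pdf is hypothetical. Like the GUI route, a PDF to PostScript to PDF round trip can drop interactive features such as forms.

```shell
# Convert a PDF to PostScript and back again with Ghostscript's
# pdf2ps and ps2pdf scripts, cleaning up the intermediate file.
# $1 = input PDF, $2 = output PDF
roundtrip_pdf() {
  in=$1
  out=$2
  # Temporary PostScript file; these are often much larger than the PDF.
  tmp=$(mktemp "${TMPDIR:-/tmp}/roundtrip.XXXXXX") || return 1
  pdf2ps "$in" "$tmp" && ps2pdf "$tmp" "$out"
  status=$?
  rm -f "$tmp"
  return "$status"
}

# Usage:
#   roundtrip_pdf foo.pdf foo_fixed.pdf
```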

Sleuthhound answered 25/8, 2014 at 10:22 Comment(3)
It worked for me by just printing to a PS file, without saving it as an image. Some people complain that pdf -> ps -> pdf conversion is not the way to go, but you seriously rescued my PhD thesis from doom with this post. - Herrera
I like your approach but changed it in my case. I did not want to use Acrobat, so I just used evince (the standard GNOME PDF reader): Print to File, chose PostScript (no image option needed), and then ran ps2pdf on the resulting file. That worked, and I did not need to search for the font paths, as the other answer requires (https://mcmap.net/q/583554/-how-to-repair-a-pdf-file-and-embed-missing-fonts). - Israel
Using ps2pdf directly on the damaged PDF worked for me for fixing fonts: ps2pdf foo.pdf foo_fixed.pdf - Anton

As mentioned by @t-bltg in a comment, Ghostscript now comes with a ps2pdf command which will embed missing fonts automatically.

ps2pdf -sFONTPATH="." in.pdf out.pdf

That should be all you need.

Troubleshooting

  • The above command presumes the necessary .ttf or .otf file is in the current working directory ("."). If it is not, change . to the directory where the font file lives. For example:

    ps2pdf -sFONTPATH="$HOME/.local/share/fonts/" input.pdf output.pdf

    (Note that you must specify a directory, not a font file. Use $HOME rather than ~ here, because the shell does not expand ~ inside double quotes.)

  • Make sure that the font name the PDF is looking for is the same as the name embedded in the font file. For example, if the PDF wants a font named "Fantasy Bold", but the font file defines a font named "Fantasy" or "Fantasy Bold Neue LT Pro", this method will not work.
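
One way to compare names before running ps2pdf is to ask fontconfig what a font file calls itself. The sketch below assumes fc-scan is installed and that your fontconfig version knows the postscriptname element; the function name font_declares_name is hypothetical. It tests for an exact match only, which approximates the rule above and is not necessarily how Ghostscript itself matches names.

```shell
# Succeed if the font file declares exactly the name the PDF asks for,
# as a family name, full name, or PostScript name.
# $1 = wanted font name, $2 = path to a .ttf or .otf file
font_declares_name() {
  wanted=$1
  file=$2
  # fontconfig separates multiple values (e.g. localized names) with
  # commas, so split them onto separate lines before matching.
  fc-scan --format '%{family}\n%{fullname}\n%{postscriptname}\n' "$file" |
    tr ',' '\n' |
    grep -Fxq -- "$wanted"
}

# Usage:
#   font_declares_name "Fantasy Bold" ./FantasyBold.ttf && echo "names match"
```

The name the PDF is looking for can be read from a font listing such as the one printed by Poppler's pdffonts.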

Bonus: The 35 standard fonts

Even without specifying FONTPATH, Ghostscript knows about the 35 standard PostScript Level 2 fonts (which include the 14 standard PDF fonts), including their old names. For example, modern systems sometimes do not render "New Century Schlbk" correctly because they expect "New Century Schoolbook". A simple ps2pdf in.pdf out.pdf solves the problem.

Seminarian answered 8/1, 2023 at 3:10 Comment(0)
