Replace all font glyphs in a PDF by converting them to outline shapes

I am looking for a way to 'outline' all text/fonts in a PDF file, i.e. convert them to curves.

I would prefer to do this without having to convert the PDF to PostScript and back. Also, I would like to use free lightweight cross-platform tools that can be automated from the command line, such as Ghostscript or MuPDF.

Marv answered 1/3, 2015 at 18:36 Comment(4)
LaTeXiT can do this and I believe it uses Ghostscript (not sure). I tried to dig through the source to find out how it does it, but didn't succeed. – Marv
Ghostscript can do this now, but previously it couldn't readily do so (you would have had to go via PostScript). I've added the information as an answer below. – Boccherini
PDF-TEXT-To-Outlines with an ad blocker seems to work well for one-off, privacy-insensitive documents. – Cyanamide
@tejasvi88 However, it is not a command-line tool that can easily be automated, which is what I was looking for. – Marv

Yes, you can use Ghostscript to achieve what you want.

I. For Ghostscript versions up to 9.14

You need to go through 2 steps:

  1. Convert the PDF to a PostScript file, exploiting a side effect of the little-known -dNOCACHE parameter: with it, all fonts used are converted to outline shapes:

    gs -o somepdf.ps -dNOCACHE -sDEVICE=pswrite somepdf.pdf
    
  2. Convert the PS back to PDF (and, optionally, delete the intermediate PS file):

    gs -o somepdf-with-outlines.pdf -sDEVICE=pdfwrite somepdf.ps
    
    rm somepdf.ps
    

This method is not reliable long-term, because the Ghostscript developers have stated that -dNOCACHE may not be present in future versions.

Note: the resulting PDF will very likely be larger than the original one. Also, unless you add further command-line parameters, all images in the original PDF will likely be reprocessed according to Ghostscript's built-in defaults (for example, recompressed), which can cause unwanted side effects. You can avoid these by adding parameters that override the defaults.


II. Ghostscript versions 9.15 or newer

Ghostscript version 9.15 (released in September 2014) supports a new command line parameter:

 -dNoOutputFonts

This will cause the output devices pdfwrite, ps2write and eps2write "to 'flatten' glyphs into 'basic' marking operations (rather than writing fonts to the output)".

This means: the two steps described for pre-9.15 GS versions can be avoided. The desired result can be achieved with a single command:

 gs -o file-with-outlines.pdf -dNoOutputFonts -sDEVICE=pdfwrite file.pdf
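
For automation, the single command above could be wrapped in a small script that first checks the installed version. This is only a sketch: the function names `version_ge` and `outline_pdf` are made up for illustration, and it assumes a POSIX shell with `awk` and a Ghostscript binary named `gs` on the PATH (on Windows the console binary is typically `gswin64c`).

```shell
#!/bin/sh
# Sketch: outline the fonts of one PDF, refusing to run on Ghostscript < 9.15.
# version_ge and outline_pdf are hypothetical helper names, not part of Ghostscript.

# version_ge A B: succeed when dotted version A is >= B (compares up to three numeric fields).
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        split(a, x, "."); split(b, y, ".")
        for (i = 1; i <= 3; i++) {
            if (x[i] + 0 > y[i] + 0) exit 0
            if (x[i] + 0 < y[i] + 0) exit 1
        }
        exit 0
    }'
}

# outline_pdf IN OUT: write OUT with all glyphs flattened to vector shapes.
outline_pdf() {
    gs_version=$(gs --version) || return 1
    if ! version_ge "$gs_version" 9.15; then
        echo "Ghostscript $gs_version is too old for -dNoOutputFonts" >&2
        return 1
    fi
    gs -o "$2" -dNoOutputFonts -sDEVICE=pdfwrite "$1"
}
```

A call such as `outline_pdf file.pdf file-with-outlines.pdf` then runs the same command as above, or stops with a message on older installations.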

Note: the caveat from part I applies here too. If your PDF includes images, the simple command line above may introduce unwanted side effects. To avoid these, you need to add more specific parameters.
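
As an illustration of what such parameters might look like, the sketch below collects pdfwrite switches that turn off image downsampling and replace the automatic (possibly lossy) filter choice with lossless Flate compression. The switch names are standard pdfwrite distiller parameters, but whether this particular set suits a given document is an assumption; `image_flags` and `outline_keep_images` are hypothetical helper names.

```shell
#!/bin/sh
# Sketch: pdfwrite switches that ask Ghostscript not to downsample images
# and to store color/gray images losslessly. Helper names are illustrative only.

# image_flags: print one switch per line.
image_flags() {
    for kind in Color Gray Mono; do
        printf '%s\n' "-dDownsample${kind}Images=false"
    done
    for kind in Color Gray; do
        printf '%s\n' "-dAutoFilter${kind}Images=false" "-d${kind}ImageFilter=/FlateEncode"
    done
}

# outline_keep_images IN OUT: outline fonts while applying the switches above.
outline_keep_images() {
    # The substitution is deliberately unquoted so that each switch becomes its own argument.
    gs -o "$2" -dNoOutputFonts -sDEVICE=pdfwrite $(image_flags) "$1"
}
```

Lossless storage can make the output noticeably larger when the source images were JPEG-compressed, so it is worth comparing file sizes before adopting these switches.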

Mexicali answered 1/3, 2015 at 20:0 Comment(3)
Hey Kurt, I have created a photobook PDF with images, captions and emojis, and I need to print it. What is the ideal way to convert a photobook PDF to a "print-ready" PDF? Which Ghostscript options should I use? Can you guide me or point me to some resources? Thanks a lot in advance. I tried outlining the fonts in my photobook PDF with the command from this answer, and it works fine. But since the PDF contains images, emojis and text, I am not sure whether that is the exact command or whether I need some extra options in the long run. – Indicia
@Kurt, nice answer. You really should add a link to your other answer about how to keep the raster image resolution: superuser.com/a/373740/207447 – Grolier
Add a related document reference for -dNoOutputFonts. But note that the new PDF created by Ghostscript with default settings is not necessarily much more "intelligent" (that is, an overall smaller, better optimized file from a bloated input PDF). See also How to remove duplicate objects in PDF using ghostscript? – Fastidious

This commit adds a new switch -dNoOutputFonts to the Ghostscript pdfwrite and ps2write devices which will produce a PDF file (or PostScript, depending on the selected device) where all the glyphs have been created as vectors, not as text.

You will need at least version 9.15 of Ghostscript to get this feature. Be aware that the PDF file will almost certainly be larger and copy/paste/search will (obviously) not work.
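
One way to confirm that no fonts survived is to list them with `pdffonts` from Poppler/Xpdf, which prints two header lines followed by one row per font. The sketch below assumes that output layout; `has_fonts` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: detect leftover fonts from `pdffonts` output read on stdin.
# Assumes two header lines (column names and a dashed rule) precede any font rows.
has_fonts() {
    awk 'NR > 2 && NF > 0 { found = 1 } END { exit !found }'
}
```

For example, `pdffonts file-with-outlines.pdf | has_fonts && echo "fonts remain"` should print nothing for a fully outlined file.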

Boccherini answered 1/3, 2015 at 19:51 Comment(3)
Yes, I tested this and found that the larger size was not caused only by converting fonts to outline shapes/vectors/curves. For example, I had a PDF with one watermark image embedded and referenced indirectly on each page. After running Ghostscript, I inspected the output with itext-rups-7.1.11.jar and found that it contained a duplicate of the image on each page: `Pages: ... Page 3 124 0 R => Image Stream Page 4 171 0 R => Image Stream ... XRef: ... 124 => Image Stream 171 => Image Stream ...` – Fastidious
The comment above doesn't seem to have anything to do with the original question or answer. samm, if you have a problem, please start a new question. For other readers: Ghostscript's pdfwrite device will (by default) hash all images and only use one if they are identical. Of course, samm has not provided an input file, a command line, an output file, or even information on which OS or version of Ghostscript was used, which makes it impossible to investigate or comment. – Boccherini
Well, it seems to have little to do with converting text to curves without embedded fonts. I just wanted to add a note about the larger size of the output PDF in case someone is concerned about it. I used gs v9.52 on Windows 10 with `gs -o book.vectored.pdf -dNoOutputFonts -sDEVICE=pdfwrite book.optimized.pdf`, and the PDF had 300+ pages. When I applied the same optimization algorithm to book.vectored.pdf as I had used for book.optimized.pdf, I could reduce the size by 10 MB. – Fastidious

III. Ghostscript version 9.54.0 (Windows 10)

I found a method that preserves all fonts flawlessly as vectors, without any visual errors, in just two steps once Ghostscript is installed and configured correctly.

(Note: you must add Ghostscript's bin and lib folders to your Windows PATH, otherwise Ghostscript will not do anything.) Instructions here

  1. In Acrobat Reader, print your PDF file (the one containing vector-based fonts or other vector elements) to a file named YourFile.prn, using the Microsoft PS Class Driver. (To install this driver: Control Panel - Devices - Printers & Scanners - Add a Printer or scanner. Let Windows search for a connected printer for a while; when it stops, select the option "The printer that I want is not listed" - Add a local printer or network printer with manual settings - Next - Use an existing port: > File:(Print to File) - Next - Microsoft: Microsoft PS Class Driver - Next)

  2. Open Command Prompt, navigate to the folder where YourFile.prn is located, and type:

    "C:\Program Files\gs\gs9.54.0\bin\gswin64c.exe" -dNOPAUSE -dNOCACHE -dBATCH -sDEVICE=eps2write -sOutputFile=YourFile.eps YourFile.prn

If you need to do this regularly, you can also create a prn2eps.bat file containing the following:

"C:\Program Files\gs\gs9.54.0\bin\gswin64c.exe" -dNOPAUSE -dNOCACHE -dBATCH -sDEVICE=eps2write -sOutputFile=%1.eps %1.prn

To use that bat file, just type: prn2eps YourFile. (Note: the bat file and YourFile.prn must be in the same directory.)

For some reason, the newest Ghostscript ps2epsi function didn't work in Windows 10, and Adobe-made PDFs had minor but consistent errors in some font characters when I imported them as PDFs into non-Adobe design software. Over the years I have found EPS to be one of the most reliable formats when vectors must be preserved from one application to another. Often, printing the PDF to PDF again with a different printer driver, or a single format conversion with Ghostscript, is enough, but not always.

Oscillogram answered 22/6, 2021 at 21:10 Comment(3)
Solution "II" from the accepted answer works in Ghostscript 9.54 just as before (I use it regularly). The other answers did not rely on GSView. I am not sure what issue your answer is trying to address. – Marv
I did try that solution, but for some reason some specific fonts still had errors (some deformed characters, as if some vertices or control vectors were missing), which were fixed only by first printing to PS with Windows 10's own driver and then converting that to EPS. I have used Ghostscript for decades to fix all kinds of odd visual errors in vector file conversions; it's a great tool! GSview just made it super easy to use, since it had a graphical UI, and that's no longer available. – Oscillogram
It will be helpful to readers if you explain (within the answer itself) what problem your solution is meant to address. – Marv
