What are PostScript dictionaries, and how can they be accessed (via Ghostscript)?

5

8

I usually think of Ghostscript as a command-line tool; however, I never cease to be amazed at the sheer number of settings and options it offers, which is due to the fact that Ghostscript is a full-blown PostScript language interpreter (which I often forget).

For instance, the question "Querying Ghostscript for the default options/settings of an output device (such as 'pdfwrite' or 'tiffg4')" shows how to retrieve the default options for a given output device. However, what I'd like to know is: are these options related to so-called PostScript dictionaries?

Or, to put it in other words: what are PostScript dictionaries, and what facilities does Ghostscript have to query (and possibly modify) this data?

Levigate answered 21/6, 2012 at 12:3 Comment(0)
11

To put it in the simplest terms: in PostScript, a dictionary is a table of key/value pairs. Dictionaries allow the PostScript interpreter to look up whether a key exists and to fetch its value for use in any procedure. The interpreter can also create keys, store or modify values, and even create complete custom dictionaries (as dictated by the PostScript code it's processing). Keys are usually of type name (but they may be of any other type as well, with the exception of null).
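
As a rough illustration of the above (a minimal sketch, assuming gs is installed and on your PATH; the names d, answer and missing are made up for this demo), here is a dictionary being created, filled and queried from the command line:

```shell
# Assumes Ghostscript is available as "gs"; d, answer and missing are demo names.
# Create a 4-entry dictionary, store one key/value pair, fetch the value,
# then ask whether a key that was never stored is known.
out=$(gs -q -dNODISPLAY -c "/d 4 dict def  d /answer 42 put  d /answer get ==  d /missing known ==  quit")
printf '%s\n' "$out"
```

This should print 42 followed by false: get fetches the stored value, and known reports that /missing was never defined in d.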

Two of these dictionaries must always be present, for any implementation of a PostScript interpreter:

  • systemdict This one holds pre-defined PostScript operators (and the implementations to make them do what the PostScript specification expects them to).

  • userdict This one holds variables and procedures of a PostScript program (think of 'procedures' as being functions or subroutines which are constructed by the combination of language-defined operators and program-defined values and parameters).

One word about names: names are what other programming languages call identifiers (and they are case-sensitive). These identifiers may be variable or procedure names. Syntactically, a name may be made up of any regular characters, that is, anything except whitespace and the delimiters ( ) < > [ ] { } / and % (but names are not strings).

As you may be aware, PostScript is a stack-oriented language. It uses several stacks:

  • operand stack This stack holds every single operand and every result of intermediate operations (turning the last result temporarily into the top-most element of the operand stack).

  • dictionary stack As the name says: this stack holds only dictionaries. As such the stack defines the current context for any key/name lookup.

  • execution stack This one holds executable objects, i.e. mainly procedures and files which are currently being executed. If the interpreter interrupts the execution of a current object, it puts the interrupted object onto this stack. After an object was completely executed, it is removed from the stack and execution continues with the one that is top-most now.

  • graphics state stack This stack holds the current context for painting graphical elements: current line width setting, current font, current color or grayscale value, current path... The current graphics state may be saved (gsave) and restored (grestore) later. The top-most graphics state is always the current graphics state.

All these stacks are independent from each other. However, the operand, dictionary and graphics state stacks are under the control of the PostScript program (that is, may be manipulated by it). The execution stack is the sole property of the interpreter.
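
To see the graphics state stack from the list above in action, here is a small sketch (assuming gs is on your PATH; the line widths 1 and 5 are arbitrary demo values). It changes the line width inside a gsave/grestore pair and checks that the old value comes back:

```shell
# Assumes Ghostscript is available as "gs"; the widths 1 and 5 are arbitrary.
# gsave pushes a copy of the graphics state, grestore pops it again,
# so the line width set in between is discarded.
out=$(gs -q -dNODISPLAY -c "1 setlinewidth  gsave  5 setlinewidth  currentlinewidth 5 eq ==  grestore  currentlinewidth 1 eq ==  quit")
printf '%s\n' "$out"
```

Both comparisons should print true: the width is 5 between gsave and grestore, and back to 1 afterwards.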

For each stack there are certain limitations (such as the number of elements which may be stored on it). PostScript has operators which manipulate the operand stack: put a new element on the stack, remove the top-most element (pop), duplicate the top-most element (dup), shuffle the order of elements on the stack (roll), swap the two top-most elements (exch), and quite a few more (a good intro to PostScript programming is the 'Blue Book' from Adobe).
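
The operand stack operators just mentioned can be tried out directly (a minimal sketch, assuming gs is on your PATH; the numbers are arbitrary):

```shell
# Assumes Ghostscript is available as "gs".
# exch swaps the top two operands, dup duplicates the top one,
# pop discards it, and "3 1 roll" moves the top of three elements to the bottom.
out=$(gs -q -dNODISPLAY -c "10 20 exch sub ==  3 dup mul ==  1 2 3 pop add ==  1 2 3 3 1 roll pstack  quit")
printf '%s\n' "$out"
```

This should print 10, 9 and 3 for the three calculations, then 2, 1, 3 from pstack, which lists the stack from the top down after the roll.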

As I already said, dictionaries have their own stack which holds all dictionaries a PostScript interpreter may use.

On that stack there may be a separate dictionary of fonts, or any number of dictionaries that a PostScript program wants to create (using the dict operator) and use privately, or some dictionaries that are specific to a certain PostScript interpreter, such as Ghostscript.

The systemdict is always the bottom-most one, with userdict above it (in LanguageLevel 2 and later, globaldict sits between the two). These permanent dictionaries cannot be removed from the dictionary stack, whereas all the other ones can be pushed with begin and popped again with end. Note that the operand-stack operators such as pop do not act on the dictionary stack.
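
Pushing and popping the dictionary stack can be watched with countdictstack (a minimal sketch, assuming gs is on your PATH; the size 5 is arbitrary, and the absolute depth you see depends on the interpreter):

```shell
# Assumes Ghostscript is available as "gs".
# Print the dictionary stack depth, push a fresh dictionary with begin,
# print the depth again, pop it with end, and print the depth a third time.
out=$(gs -q -dNODISPLAY -c "countdictstack ==  5 dict begin  countdictstack ==  end  countdictstack ==  quit")
printf '%s\n' "$out"
```

The second number should be one higher than the first, and the third should equal the first again.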

Whenever the interpreter is looking up a name, it searches the dictionaries for that name, starting with the top-most dictionary. Hence userdict is searched before systemdict. As soon as the name is found (a key), the interpreter stops searching and uses that key (or rather, the value it holds). The consequence of this architecture is that the PostScript programmer may override any PostScript operator that is pre-defined in systemdict with their own variant: a definition of the same name in userdict (or in any dictionary higher on the stack) is found first and shadows the original, which itself stays untouched in systemdict.
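
Here is a small terminal sketch of that lookup order (assuming gs is on your PATH and that -c code starts out with userdict as the current dictionary, as it does in Ghostscript). It deliberately shadows the add operator with a procedure that multiplies:

```shell
# Assumes Ghostscript is available as "gs"; redefining add is purely for demonstration.
# 1. Define /add in userdict so that it multiplies instead.
# 2. "2 3 add" now finds the userdict definition first.
# 3. The original operator is still reachable via systemdict.
# 4. where reports the first dictionary on the stack that holds /add.
out=$(gs -q -dNODISPLAY -c "/add {mul} def  2 3 add ==  2 3 systemdict /add get exec ==  /add where {userdict eq ==} if  quit")
printf '%s\n' "$out"
```

This should print 6 (the shadowing definition), 5 (the untouched operator fetched from systemdict) and true (the name was found in userdict first).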

Also, some dictionaries can be 'private' as far as the PS program is concerned (no-access, such as font dictionaries) or 'read-only'.


Designer answered 21/6, 2012 at 14:0 Comment(3)
Many thanks for the detailed answer, @pipitas! If/when you have the time, could you also add a short ghostscript terminal example, of "the interpreter [] looking up a name" (which can also show "userdict is searched before systemdict", and possibly overwriting an operator)? Many thanks again - cheers! - Levigate
@sdaau: my other answer shows you how to look up the name/key /screen in the dictionary .distillersettings: it's as simple as .distillersettings /screen get. The get operator puts the value of the key (if found) onto the operand stack (if not found, an undefined error will be printed). Now that the key's value is on the stack, the rest of the code snippet just fetches it there, formats it a bit to make it look nicer and prints out its content... - Designer
Small correction. pop does not work on the dictionary stack. begin pushes a dictionary onto the dictionary stack. end pops off the top dictionary from the dictionary stack. - Defecate
9

The other answers already covered the "What are dictionaries?" part of your question. Now let's turn to "How can Ghostscript access them?"

Maybe the question should rather be: "How can I (a power user, a developer, a geek...) access them?"

You can print out the contents of any accessible dictionary that's known to your PostScript interpreter (which may be Ghostscript) by writing a simple PostScript program one-liner -- or by simply calling the interpreter (Ghostscript) with the program code handed over on the commandline (-c ...).

You only need to know the name of the respective dictionary for this.

Let's look at one such interesting internal Ghostscript dictionary, called .distillersettings:

gs \
 -dNODISPLAY \
 -c ".distillersettings {exch ==only ( ) print ==} forall quit"

Result:

/default -dict-
/prepress -dict-
/PSL2Printer -dict-
/ebook -dict-
/screen -dict-
/printer -dict-

This may not tell you much at first glance. But you may recognize some of the key names in that dictionary: /prepress, /printer, /screen, /ebook...

You can use any of these on a Ghostscript command line to ask for a pre-defined set of settings when you want output made by -sDEVICE=pdfwrite (the Ghostscript 'Distiller'-like functionality). To ask for such a set of settings, simply add, for example, -dPDFSETTINGS=/printer to the command line.

At second glance you'll see that the content of the .distillersettings dictionary is essentially a set of 6 more dictionaries. It is a 'dictionary of dictionaries'.

Dictionary contents are not printed out by default (not with the PostScript code above). But if you want them, you can use a Ghostscript-specific procedure called === instead of the standard PostScript language operator == in the above command. This procedure behaves the same as == except that it also expands the dictionaries and prints all key:value pairs contained in them.

Be careful with that === procedure: the -dict- you're trying to expand may be veeeeeery long and could cause you to lose your eyesight. :-)

In our current case however it is still manageable:

gs \
 -dNODISPLAY \
 -c ".distillersettings {exch ==only ( ) print ===} forall quit"

Output now is:

 /default << /Optimize false /DoThumbnails false /PreserveEPSInfo true /ColorConversionStrategy /LeaveColorUnchanged /DownsampleMonoImages false /EmbedAllFonts true /CannotEmbedFontPolicy /Warning /PreserveOPIComments true /GrayACSImageDict << /HSamples [2 1 1 2] /VSamples [2 1 1 2] /QFactor 0.9 /Blend 1 >> /DownsampleColorImages false /PreserveOverprintSettings true /CreateJobTicket false /AutoRotatePages /PageByPage /NeverEmbed [/Courier /Courier-Bold /Courier-Oblique /Courier-BoldOblique /Helvetica /Helvetica-Bold /Helvetica-Oblique /Helvetica-BoldOblique /Times-Roman /Times-Bold /Times-Italic /Times-BoldItalic /Symbol /ZapfDingbats] /ColorACSImageDict << /HSamples [2 1 1 2] /VSamples [2 1 1 2] /QFactor 0.9 /Blend 1 >> /DownsampleGrayImages false /UCRandBGInfo /Preserve >>
 /prepress << /DoThumbnails true /MonoImageResolution 1200 /ColorImageDownsampleType /Bicubic /PreserveEPSInfo true /ColorConversionStrategy /LeaveColorUnchanged /GrayImageDownsampleType /Bicubic /EmbedAllFonts true /CannotEmbedFontPolicy /Error /PreserveOPIComments true /GrayImageResolution 300 /GrayACSImageDict << /ColorTransform 1 /QFactor 0.15 /Blend 1 /HSamples [1 1 1 1] /VSamples [1 1 1 1] >> /ColorImageResolution 300 /PreserveOverprintSettings true /CreateJobTicket true /AutoRotatePages /None /MonoImageDownsampleType /Bicubic /NeverEmbed [] /ColorACSImageDict << /ColorTransform 1 /QFactor 0.15 /Blend 1 /HSamples [1 1 1 1] /VSamples [1 1 1 1] >> /CompatibilityLevel 1.4 /UCRandBGInfo /Preserve >>
 /PSL2Printer << /DoThumbnails false /CompatibilityLevel 1.2 /TransferFunctionInfo /Preserve /MonoImageResolution 1200 /PreserveEPSInfo true /CompressFonts true /ColorImageDownsampleType /Bicubic /GrayImageDownsampleType /Bicubic /ColorConversionStrategy /LeaveColorUnchanged /EmbedAllFonts true /ColorACSImageDict << /ColorTransform 1 /QFactor 0.15 /Blend 1 /HSamples [1 1 1 1] /VSamples [1 1 1 1] >> /CannotEmbedFontPolicy /Error /PreserveOPIComments true /CompressPages true /GrayImageResolution 600 /GrayACSImageDict << /ColorTransform 1 /QFactor 0.15 /Blend 1 /HSamples [1 1 1 1] /VSamples [1 1 1 1] >> /ColorImageResolution 600 /PreserveOverprintSettings true /AutoRotatePages /None /MonoImageDownsampleType /Bicubic /ASCII85EncodePages true /MaxViewerMemorySize 8000000 /NeverEmbed [] /PreserveHalftoneInfo true /UCRandBGInfo /Preserve >>
 /ebook << /DoThumbnails false /MonoImageResolution 300 /ColorImageDownsampleType /Bicubic /PreserveEPSInfo false /ColorConversionStrategy /sRGB /GrayImageDownsampleType /Bicubic /EmbedAllFonts true /CannotEmbedFontPolicy /Warning /PreserveOPIComments false /GrayImageResolution 150 /GrayACSImageDict << /ColorTransform 1 /QFactor 0.76 /Blend 1 /HSamples [2 1 1 2] /VSamples [2 1 1 2] >> /ColorImageResolution 150 /PreserveOverprintSettings false /CreateJobTicket false /AutoRotatePages /All /MonoImageDownsampleType /Bicubic /NeverEmbed [/Courier /Courier-Bold /Courier-Oblique /Courier-BoldOblique /Helvetica /Helvetica-Bold /Helvetica-Oblique /Helvetica-BoldOblique /Times-Roman /Times-Bold /Times-Italic /Times-BoldItalic /Symbol /ZapfDingbats] /ColorACSImageDict << /ColorTransform 1 /QFactor 0.76 /Blend 1 /HSamples [2 1 1 2] /VSamples [2 1 1 2] >> /CompatibilityLevel 1.4 /UCRandBGInfo /Remove >>
 /screen << /DoThumbnails false /MonoImageResolution 300 /ColorImageDownsampleType /Average /PreserveEPSInfo false /ColorConversionStrategy /sRGB /GrayImageDownsampleType /Average /EmbedAllFonts true /CannotEmbedFontPolicy /Warning /PreserveOPIComments false /GrayImageResolution 72 /GrayACSImageDict << /ColorTransform 1 /QFactor 0.76 /Blend 1 /HSamples [2 1 1 2] /VSamples [2 1 1 2] >> /ColorImageResolution 72 /PreserveOverprintSettings false /CreateJobTicket false /AutoRotatePages /PageByPage /MonoImageDownsampleType /Average /NeverEmbed [/Courier /Courier-Bold /Courier-Oblique /Courier-BoldOblique /Helvetica /Helvetica-Bold /Helvetica-Oblique /Helvetica-BoldOblique /Times-Roman /Times-Bold /Times-Italic /Times-BoldItalic /Symbol /ZapfDingbats] /ColorACSImageDict << /ColorTransform 1 /QFactor 0.76 /Blend 1 /HSamples [2 1 1 2] /VSamples [2 1 1 2] >> /CompatibilityLevel 1.3 /UCRandBGInfo /Remove >>
 /printer << /DoThumbnails false /MonoImageResolution 1200 /ColorImageDownsampleType /Bicubic /PreserveEPSInfo true /ColorConversionStrategy /UseDeviceIndependentColor /GrayImageDownsampleType /Bicubic /EmbedAllFonts true /CannotEmbedFontPolicy /Warning /PreserveOPIComments true /GrayImageResolution 300 /GrayACSImageDict << /ColorTransform 1 /QFactor 0.4 /Blend 1 /HSamples [1 1 1 1] /VSamples [1 1 1 1] >> /ColorImageResolution 300 /PreserveOverprintSettings true /CreateJobTicket true /AutoRotatePages /None /MonoImageDownsampleType /Bicubic /NeverEmbed [] /ColorACSImageDict << /ColorTransform 1 /QFactor 0.4 /Blend 1 /HSamples [1 1 1 1] /VSamples [1 1 1 1] >> /CompatibilityLevel 1.4 /UCRandBGInfo /Preserve >>

Still not so nice. So let's try to get it better. The way we can do it is to modify our PostScript code: we tell it now to access the .distillersettings dictionary and get the value of one of the keys from it (let's use /screen). Since we know that the value is another dictionary, we know we'll get another set of key:value pairs which we will be able to format the same way we did before:

gs \
 -q \
 -dNODISPLAY \
 -c ".distillersettings /screen get {exch ==only ( ) print ==} forall quit"

Now this looks nicer, doesn't it? See for yourself:

/DoThumbnails false
/MonoImageResolution 300
/ColorImageDownsampleType /Average
/PreserveEPSInfo false
/ColorConversionStrategy /sRGB
/GrayImageDownsampleType /Average
/EmbedAllFonts true
/CannotEmbedFontPolicy /Warning
/PreserveOPIComments false
/GrayImageResolution 72
/GrayACSImageDict -dict-
/ColorImageResolution 72
/PreserveOverprintSettings false
/CreateJobTicket false
/AutoRotatePages /PageByPage
/MonoImageDownsampleType /Average
/NeverEmbed [/Courier /Courier-Bold /Courier-Oblique /Courier-BoldOblique /Helvetica /Helvetica-Bold /Helvetica-Oblique /Helvetica-BoldOblique /Times-Roman /Times-Bold /Times-Italic /Times-BoldItalic /Symbol /ZapfDingbats]
/ColorACSImageDict -dict-
/CompatibilityLevel 1.3
/UCRandBGInfo /Remove

As your sharp eye may have spotted already: some of the key values are again dictionaries. You are free to use the above command again, this time with a === in place of the second == to resolve the mysteries that /GrayACSImageDict -dict- and the like may keep hiding...

In any case, now you know what you save on typing by simply using -dPDFSETTINGS=/screen instead of enumerating all the single parameters embedded in this /screen dictionary...

And you also know what single value you need to override should you want the general 'screen' quality output, but with the difference that all fonts get embedded:

gs \
 -o out.pdf \
 -sDEVICE=pdfwrite \
 -dPDFSETTINGS=/screen \
 -c "<</NeverEmbed [ ] /AlwaysEmbed [/Courier /Courier-Bold /Courier-Oblique /Courier-BoldOblique /Helvetica /Helvetica-Bold /Helvetica-Oblique /Helvetica-BoldOblique /Times-Roman /Times-Bold /Times-Italic /Times-BoldItalic /Symbol /ZapfDingbats]>> setdistillerparams" \
 -f input.pdf

You can explore a lot of interesting things this way about Ghostscript internals, if only you know the name of the dictionaries it uses. :-)

Designer answered 21/6, 2012 at 15:43 Comment(1)
Fantastic, @pipitas - now this is something I've wanted to know for quite a while; especially what == means, and how to use it; the warning about long dictionaries is also appreciated - well, the entire post is! Edited title - and if you could just link this answer at the end of the accepted one, it would be great! Many, many thanks for this - cheers! - Levigate
7

Lots of good answers already, but nobody's mentioned this:

When invoking ghostscript, the -d and -s options create initial definitions in systemdict. This allows you to do parameterized invocation of your postscript program.

Use -dname[=token] to set the value to null, or a number (or any other single postscript token). Use -sname=string to set a string value (which in most contexts works as well as a name).
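
A quick way to try this (a minimal sketch, assuming gs is on your PATH; myint and mystr are made-up names for the demo, not options Ghostscript itself knows about):

```shell
# Assumes Ghostscript is available as "gs"; myint and mystr are hypothetical names.
# -d defines a name from a single PostScript token, -s defines it as a string.
# Both switches must come before -c so the names exist when the code runs.
out=$(gs -q -dNODISPLAY -dmyint=42 -smystr=hello -c "myint ==  mystr ==  myint 1 add ==  quit")
printf '%s\n' "$out"
```

This should print 42, then (hello) in string syntax, then 43, showing that myint arrived as a real number rather than as text.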

And you can manipulate all the stacks to some extent with the right operators.

  • token push to operand stack from string or file (this is what the interpreter loop uses to consume the program stream, so this is what you're using whether inputting code through a file or straight from the keyboard)
  • pop discard from operand stack
  • begin push to dict stack
  • end pop from dict stack
  • run, exec, %procedure-invocation push to exec stack
  • exit, stop pop or clear exec stack
  • gsave push gstate on graphics stack
  • grestore pop graphics stack
  • save push a copy of all VM-contents (all dicts and arrays, but not strings)
  • restore rewind memory to saved state (revert all dicts and arrays to previous state)

Dictionaries, being composite objects, inherit a number of operators common to all composite objects.

  • -typename- create object, eg dict
  • length report size of object
  • put insert an element
  • get retrieve an element
  • copy populate an object with contents from another object
  • forall do something to each element
  • *load alternate retrieve element (for dictionaries, load performs a search with where and then a get; for arrays, aload spills the entire contents of the array on the operand stack)
  • *store alternate insert element (for dictionaries, store performs a search with where and then a put if found, or def if not; for arrays, astore fills the array from objects on the stack)

To this suite, dictionaries add

  • def put into current dictionary (top of dict stack)
  • known query dictionary for an element
  • where query all dictionaries for element
  • maxlength no longer interesting after PS Level 2 added auto-expanding dictionaries and gc
  • dictstack copy the dictstack into an array (maybe you wanna search bottom-up, you can!)
  • names not preceded by a slash / are automatically loaded and, if executable, executed
  • // while token is constructing a postscript object any names preceded by a double-slash are loaded and substituted into the procedure array. This is very powerful, as you can mimic Lisp macros.
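
A few of the operators from these lists in one go (a minimal sketch, assuming gs is on your PATH; x, p and nosuchname_demo are made-up names, and the last one is assumed not to be defined anywhere):

```shell
# Assumes Ghostscript is available as "gs"; x, p and nosuchname_demo are demo names.
# known queries one dictionary, load searches the whole dictionary stack,
# where returns the dictionary holding a key (or just false if there is none).
# //x is looked up while the procedure is being scanned, so p keeps the
# value x had at that moment even after x is redefined.
out=$(gs -q -dNODISPLAY -c "/x 1 def  userdict /x known ==  /x load ==  /nosuchname_demo where {pop (found) =} {(not found) =} ifelse  /p {//x} def  /x 2 def  p ==  x ==  quit")
printf '%s\n' "$out"
```

This should print true, 1, not found, 1 and 2. The last two lines are the interesting ones: p still yields the old value 1 because //x was substituted at scan time, while a plain x is looked up when executed and yields 2.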

Edit: One more thing. When creating a dictionary there is a time/space trade-off when you choose the size for the dictionary. Dictionaries are almost certainly implemented as a hash-table (in all but the simplest interpreters), and most hash-functions can avoid collisions when the table is about half-full (Rule of Thumb: Use double-sized dicts for speed). Since level-2, of course, dictionaries will grow automatically when you add size+1 elements, presumably by allocating a new dictionary of k*size (where k is probably 1.5 or 2); but controlling sizes manually can give you a speed boost. In level-1, if you're not multiply-referencing your dictionaries, you can install a replacement for dictfull in errordict to grow the dict and re-execute the put (or def or whatever). Since level-2 does this internally, it can replace all references.
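
The automatic growth is easy to observe (a minimal sketch, assuming gs is on your PATH and running in its default language level, which is above Level 1; d, a, b and c are demo names):

```shell
# Assumes Ghostscript is available as "gs" in its default (Level 2+) mode.
# Create a dictionary sized for 1 entry, then store 3 entries in it.
# In Level 1 the second put would raise dictfull; here the dictionary grows.
out=$(gs -q -dNODISPLAY -c "/d 1 dict def  d /a 1 put  d /b 2 put  d /c 3 put  d length ==  d maxlength 3 ge ==  quit")
printf '%s\n' "$out"
```

This should print 3 and true: all three entries fit, and maxlength has grown to at least 3. The exact capacity after growing is up to the interpreter, so the sketch only checks a lower bound.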

Tuberculosis answered 6/7, 2012 at 0:7 Comment(0)
4

If you want to obtain a list of other dictionaries which are contained in the systemdict and userdict dictionaries, just run:

for _dict in userdict systemdict; \
   do \
   gs \
     -dNODISPLAY \
     -c "${_dict} {exch ==only ( ) print ==} forall quit"; \
done \
| awk '{print $1, $2}' \
| grep -- -dict- \
| sort

This will produce a sorted list of dictionary names which you could investigate for potentially 'interesting' names.

You'll find such names like Fontmap, localdict, AdobeGlyphList, userparams, .eexec_param_dict, .substitutefamilies, EncodingDirectory, colorspacedict, .distillerparamkeys, devicedict, .symbol_list, ...

With each of these names you can look up more or less interesting info and tidbits about Ghostscript's internals by running, for example:

gs \
  -q \
  -dNODISPLAY \
  -c "Fontmap {exch ==only ( ) print ==} forall quit"

As you can see, even the Fontmap used by Ghostscript is stored in a dictionary. An extract of my results here locally is this:

[....]
/Arial [/ArialMT]
/Arial,Bold [/Arial-BoldMT]
/AvantGarde-Book [/URWGothicL-Book]
/Bookman-Demi [/URWBookmanL-DemiBold]
/Calligraphic-Hiragana [(fhirw.gsf)]
/Calligraphic-Katakana [(fkarw.gsf)]
/Charter-Bold [/CharterBT-Bold]
/CharterBT-Bold [(bchb.pfa)]
/Courier [/NimbusMonL-Regu]
/Courier-Bold [/NimbusMonL-Bold]
/Courier-BoldOblique [/NimbusMonL-BoldObli]
/Courier-Oblique [/NimbusMonL-ReguObli]
/Helvetica [/NimbusSanL-Regu]
/Helvetica-Bold [/NimbusSanL-Bold]
/NewCenturySchlbk-Bold [/CenturySchL-Bold]
/Palatino-Roman [/URWPalladioL-Roma]
/Symbol [/StandardSymL]
/Times-Bold [/NimbusRomNo9L-Medi]
/TimesNewRoman,Bold [/TimesNewRomanPS-BoldMT]
/Utopia-Regular [(putr.pfa)]
/ZapfDingbats [/Dingbats]
[....]

Note of Caution: The above is not the file format you have to use when you want to edit the Fontmap file that Ghostscript reads (in general, or for a particular job). For that format, please read the comments in an example Fontmap file as shipped with Ghostscript. The above list is the representation of the fontmap as Ghostscript stores it in its internal dictionary.

Designer answered 21/6, 2012 at 18:21 Comment(0)
2

Dictionaries in PostScript are a 'container' object; they are in essence a list of pairs, each consisting of a key and a value. See the PostScript Language Reference Manual for more details, especially section 3.3.9 in the third edition.

Dictionaries are often used to pass a set of parameters to a PostScript operator or function (for example, the image operator can take a dictionary argument), but they can equally well simply be storage.

Dictionaries can have access permissions, so it is possible to have read-only dictionaries, whose values can be examined, but not modified, and font dictionaries can be 'no access' to prevent the outline data being extracted in PostScript.

Entries in a dictionary which is not read-only or no-access may be modified at will.
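
The read-only case can be sketched like this (assuming gs is on your PATH; d and a are made-up names). The failing put is wrapped in stopped so that the resulting error is caught rather than printed:

```shell
# Assumes Ghostscript is available as "gs"; d and a are demo names.
# Fill a dictionary, make it read-only, then check that it is no longer
# writable (wcheck), that a put fails (stopped returns true when an error
# occurred), and that the old value can still be read.
out=$(gs -q -dNODISPLAY -c "/d 2 dict def  d /a 1 put  d readonly pop  d wcheck ==  {d /a 2 put} stopped ==  d /a get ==  quit")
printf '%s\n' "$out"
```

This should print false (not writable), true (the put was rejected) and 1 (the original value is still there and readable).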

Septennial answered 21/6, 2012 at 13:0 Comment(1)
Many thanks for the answer, @KenS; here is also a link to the current PostScript Language Reference Manual (PLRM.pdf) - cheers! - Levigate
