Using Unicode 'dingbat-like' glyphs in R graphics, across devices & platforms, especially PDF

Asked 4/5, 2011 at 15:37 Answered 26/10, 2013 at 19:25

Some of you may have seen my blog post on this topic, where I wrote the following code after wanting to help a friend produce half-filled circles as points on a graph:

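TestUnicode <- function(start="25a0", end="25ff", ...)
  {
    # Character arguments are read as hex; numeric ones are used as given
    nstart <- as.hexmode(start)
    nend <- as.hexmode(end)
    r <- nstart:nend
    # Side length of the smallest square grid that holds every code point
    s <- ceiling(sqrt(length(r)))
    par(pty="s")
    plot(c(-1,(s)), c(-1,(s)), type="n", xlab="", ylab="",
         xaxs="i", yaxs="i")
    grid(s+1, s+1, lty=1)
    for(i in seq_along(r)) {
      # A negative pch requests the glyph at that Unicode code point;
      # try() keeps the loop going if a symbol cannot be drawn
      try(points(i%%s, i%/%s, pch=-1*r[i],...))
    }
  }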

TestUnicode(9500,9900)

This works (i.e. produces a nearly-full grid of cool dingbatty symbols):

on Ubuntu 10.04, in an X11 or PNG device
on Mandriva Linux distribution, same devices, with locally built R, once pango-devel was installed

It fails to varying degrees (i.e. produces a grid partly or entirely filled with dots or empty rectangles), either silently or with warnings:

on the same Ubuntu 10.04 machine in PDF or PostScript (tried setting font="NimbusSan" to use URW fonts, doesn't help)
on MacOS X.6 (quartz, X11, Cairo, PDF)

For example, trying all the available PDF font families:

flist <- c("AvantGarde", "Bookman","Courier", "Helvetica", "Helvetica-Narrow",
        "NewCenturySchoolbook", "Palatino", "Times","URWGothic",
        "URWBookman", "NimbusMon", "NimbusSan", "NimbusSanCond",
        "CenturySch", "URWPalladio","NimbusRom")

for (f in flist) {
  fn <- paste("utest_",f,".pdf",sep="")
  pdf(fn,family=f)
  TestUnicode()
  title(main=f)
  dev.off()
  embedFonts(fn)
}

On Ubuntu, none of these files contains the symbols.
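
As a side note on the loop above, the family list does not have to be typed by hand. A minimal sketch, assuming the goal is simply "every family the pdf() device knows about": names(pdfFonts()) returns the device's font database. The result will probably include entries beyond the ones listed above (generic and CID families, for instance), so it may need filtering before being fed to the loop.

```r
# Sketch: query the font families registered for the pdf() device
# instead of hard-coding them.
flist <- names(pdfFonts())
print(flist)

# The families tried in the question should be among them
c("Helvetica", "NimbusSan") %in% flist
```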

It would be nice to get it to work on as many combinations as possible, but especially in some vector format and double-especially in PDF.

Any suggestions about font/graphics device configurations that would make this work would be welcomed.

Jespersen answered 4/5, 2011 at 15:37 Comment(0)

I think you are out of luck, Ben: according to some notes by Paul Murrell, pdf() can only handle single-byte encodings. Multi-byte encodings need to be converted to the single-byte equivalent, and therein lies the rub; by definition, single-byte encodings cannot contain all the glyphs that can be represented in a multi-byte encoding like UTF-8, say.
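
To make the single-byte limitation concrete, here is a small sketch using one of the half-filled circles from the question (U+25D0). It only illustrates the encoding sizes; it is not taken from Paul's notes.

```r
# U+25D0 (circle with left half black), one of the glyphs in the question's range
glyph <- intToUtf8(0x25D0)

nchar(glyph, type = "chars")  # 1 character ...
nchar(glyph, type = "bytes")  # ... stored as 3 bytes in UTF-8
utf8ToInt(glyph)              # 9680, far outside the 0-255 range of a single byte
```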

Paul's notes can be found here; in them he suggests a couple of solutions based on Cairo PDF devices: cairo_pdf() on suitably endowed Linux and Mac OS systems, or the Cairo package under MS Windows.
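
A minimal sketch of the cairo_pdf() route, assuming an R build with cairo support and a system font that actually contains the glyphs. The helper name cairo_unicode_demo and the output path are made up for illustration; whether the symbols appear still depends on the fonts installed.

```r
# Hypothetical helper: draw a few half-filled circles on a Cairo PDF device.
# Returns TRUE if the device could be used, FALSE if cairo is unavailable.
cairo_unicode_demo <- function(path = file.path(tempdir(), "utest_cairo.pdf")) {
  if (!capabilities("cairo")) {
    message("No cairo support in this R build; cairo_pdf() is unavailable.")
    return(FALSE)
  }
  cairo_pdf(path, width = 4, height = 4)
  on.exit(dev.off())  # close the device even if plotting fails
  plot(c(0, 5), c(0, 2), type = "n", xlab = "", ylab = "")
  # U+25D0..U+25D3 are circles with one half black;
  # a negative pch selects the glyph by Unicode code point
  codes <- 0x25D0:0x25D3
  for (i in seq_along(codes)) {
    try(points(i, 1, pch = -codes[i], cex = 3))
  }
  TRUE
}

cairo_unicode_demo()
```

The same body could call TestUnicode() from the question instead of the short loop.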

Passivism answered 4/5, 2011 at 15:55 Comment(0)

I have found the cairo_pdf device to be completely insufficient: the output is markedly different from both pdf and on-screen rendering, and its plotmath support is sketchy.

However, there’s a rather simple workaround on OS X: Use the “normal” quartz device and set its type to pdf:

quartz(type = 'pdf', file = 'output.pdf')

Unfortunately, on my computer this ignores the font family and always uses Helvetica (although the documentation claims that the default is Arial).

There are at least two other gotchas:

pdf converts hyphens to minuses. This may not even always be what you want but it’s quite useful to properly typeset negative numbers. The linked thread describes workarounds for this.
It’s of course platform specific and only works on OS X.

(I realise that OP briefly mentions the Quartz device but this thread is frequently viewed and I think this solution needs more prominence.)

Mensural answered 26/10, 2013 at 19:25 Comment(2)

can you be more specific/give specific examples of the problems, especially those with plotmath rendering? It would probably help to give the results of sessionInfo() too ... – Jespersen 26/10, 2013 at 21:11

@Ben Hmm, since this post isn’t really about plotmath, maybe a comment suffices. Tell me if you feel otherwise. Here goes: Cairo only implements a subset of plotmath; it doesn’t implement some of the symbols (see demo(plotmath)), some of the spaces are off, and it doesn’t support italic text (with Helvetica at least). [R v3.0.1, x86_64-apple-darwin10.8.0 in case that’s relevant, but I doubt it] – Mensural 26/10, 2013 at 22:2

Another solution might be to use tikzDevice which can now use XeLaTeX with Unicode characters. The resulting tex file can then be compiled to produce a pdf. The problem is still that you must have a font on your system that contains the characters.

library(tikzDevice)
options(tikzXelatexPackages=c(getOption('tikzXelatexPackages'),
    '\\setromanfont{Courier New}'))
tikz(engine='xetex',standAlone=TRUE)
TestUnicode(9500,9900)
dev.off()

The first time, this will take a LONG time.

Jeaz answered 9/5, 2011 at 23:15 Comment(5)

Hmmm. Tip appreciated. I installed XeTeX (on Ubuntu, apt-get install texlive-xetex), but I don't seem to have "Courier New" on my (Ubuntu 10.04) system (or at least XeTeX can't find it: normally apt-get install runs all of the TeX updates necessary ...). Suggestions for how to guess/find an appropriate font? – Jespersen 10/5, 2011 at 13:11

fc-list | grep -i ding should show a list of fonts installed on your computer that contain the word "ding" in their names. XeTeX should be able to access these using fontspec commands. – Effusive 10/5, 2011 at 17:30

Courier New was just an arbitrary choice; as Sharpie mentioned, you will need to find a font on your system that has the symbols. – Jeaz 10/5, 2011 at 18:0

Double hmmm. I pursued this a little way because it seems like a nice alternative, and might work on some systems where the cairo_pdf() solution does not, but ... all I know in my current state of cluelessness is that cairo_pdf seems to find a suitable font automatically (at least on Mandriva or Ubuntu systems with pango installed), whereas it would take me a while to dig through and figure out how to find the appropriate fonts, translate the names to their XeTeX equivalents, etc. – Jespersen 10/5, 2011 at 21:51

(Was running out of room in the previous comment.) The fc-list command given above returns Dingbats:style=Regular on my system. I'm not sure I know how to translate this to a valid XeTeX font specification ... naively changing Courier New to Dingbats in the code above fails. Bottom line: this seems interesting, but (given that I have a solution that works for me) probably not worth my effort pursuing at this point. Hopefully it will be useful to someone else. – Jespersen 10/5, 2011 at 21:59
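
On the font-hunting question raised in these comments: rather than grepping font names, fontconfig can be asked which fonts cover a particular code point. The sketch below wraps that query in R. It assumes fc-list is on the PATH, and the helper name fonts_with_glyph is hypothetical.

```r
# Hypothetical helper: list font families that contain a given code point,
# using fontconfig's fc-list. Returns character(0) if fc-list is missing.
fonts_with_glyph <- function(codepoint_hex = "25d0") {
  if (!nzchar(Sys.which("fc-list"))) {
    message("fc-list not found; fontconfig does not appear to be installed.")
    return(character(0))
  }
  # ":charset=<hex>" restricts the listing to fonts covering that code point;
  # "family" asks fc-list to print only the family names
  pattern <- shQuote(paste0(":charset=", codepoint_hex))
  out <- system2("fc-list", c(pattern, "family"), stdout = TRUE)
  sort(unique(out))
}

fonts_with_glyph("25d0")
```

A family name from this list is a more promising candidate for the \setromanfont line than an arbitrary choice, though whether XeTeX accepts the name as-is is an assumption that would need checking on the system in question.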

Have you tried embedding a font in the PDF, or including one for Mac users that would work?

Valerlan answered 4/5, 2011 at 15:41 Comment(1)

Thanks. Can you be slightly more specific? R has an embedFonts() function, but I believe that's intended to post-process a PDF/PostScript to make sure that fonts that are present on the current system get embedded in the file. This is (I think) a different situation, where the fonts used by R for the pdf don't include the glyphs in the first place. For example, pdf("test.pdf", family="NimbusSan"); TestUnicode(); dev.off() fails. – Jespersen 4/5, 2011 at 15:47
