Why does a Latin-characters-only Java font claim to support Asian characters, even though it does not?
Asked Answered
C

2

7

When rendering a chart with JFreeChart, I noticed a layout problem when the chart's category labels included Japanese characters. Although the text is rendered with the correct glyphs, the text was positioned in the wrong location, presumably because the font metrics were wrong.

The chart was originally configured to use the Source Sans Pro Regular font for that text, which supports only Latin character sets. The obvious solution is to bundle an actual Japanese .TTF font and ask JFreeChart to use it. This works fine, in that the output text uses the correct glyphs and it is also laid out correctly.

My questions

  • How did java.awt end up rendering the Japanese characters correctly in the first scenario, when using a source font that doesn't actually support anything except Latin characters? If it matters, I am testing on OS X 10.9 with JDK 1.7u45.

  • Is there any way to render the Japanese characters without bundling a separate Japanese font? (This is my end goal!) Although the bundling solution works, I don't want to add 6 Mb of bloat to my application if it can be avoided. Java clearly knows how to render the Japanese glyphs somehow even without the font (at least in my local environment)--it's seemingly just the metrics that are busted. I am wondering if this is related to the "frankenfont" issue below.

  • After the JRE performs an internal transformation, why does the Source Sans Pro font tell the caller (via canDisplayUpTo()) that it can display Japanese characters even though it cannot? (See below.)

Edited to clarify:

  • This is a server app, and the text we are rendering will show up in the client's browser and/or in PDF exports. The charts are always rasterized to PNGs on the server.

  • I have no control over the server OS or environment, and as nice as it would be to use the Java-standard platform fonts, many platforms have poor font choices that are unacceptable in my use case, so I need to bundle my own (at least for the Latin fonts). Using a platform font for the Japanese text is acceptable.

  • The app can potentially be asked to display a mix of Japanese and Latin text, without no a priori knowledge of the text type. I am ambivalent about what fonts get used if a string contains mixed languages, so long as the glyphs are rendered correctly.

Details

I understand that java.awt.Font#TextLayout is smart, and that when trying to lay out text, it first asks the underlying fonts whether they can actually render the supplied characters. If not, it presumably swaps in a different font that knows how to render those characters, but this is not happening here, based on my debugging pretty far into the JRE classes. TextLayout#singleFont always returns a non-null value for the font and it proceeds through the fastInit() part of the constructor.

One very curious note is that the Source Sans Pro font somehow gets coerced into telling the caller that it does know how to render Japanese characters after the JRE performs a transformation on the font.

For example:

// We load our font here (download from the first link above in the question)

File fontFile = new File("/tmp/source-sans-pro.regular.ttf");
Font font = Font.createFont(Font.TRUETYPE_FONT, new FileInputStream(fontFile));
GraphicsEnvironment.getLocalGraphicsEnvironment().registerFont(font);

// Here is some Japanese text that we want to display
String str = "クローズ";

// Should say that the font cannot display any of these characters (return code = 0)

System.out.println("Font " + font.getName() + " can display up to: " + font.canDisplayUpTo(str));

// But after doing this magic manipulation, the font claims that it can display the
// entire string (return code = -1)

AttributedString as = new AttributedString(str, font.getAttributes());
Map<AttributedCharacterIterator.Attribute,Object> attributes = as.getIterator().getAttributes();
Font newFont = Font.getFont(attributes);

// Eeek, -1!    
System.out.println("Font " + newFont.getName() + " can display up to: " + newFont.canDisplayUpTo(str));

The output of this is:

Font Source Sans Pro can display up to: 0
Font Source Sans Pro can display up to: -1

Note that the three lines of "magic manipulation" mentioned above are not something of my own doing; we pass in the true source font object to JFreeChart, but it gets munged by the JRE when drawing the glyphs, which is what the three lines of "magic manipulation" code above replicates. The manipulation shown above is the functional equivalent of what happens in the following sequence of calls:

  1. org.jfree.text.TextUtilities#drawRotatedString
  2. sun.java2d.SunGraphics2D#drawString
  3. java.awt.font.TextLayout#(constructor)
  4. java.awt.font.TextLayout#singleFont

When we call Font.getFont() in the last line of the "magic" manipulation, we still get a Source Sans Pro font back, but the underlying font's font2D field is different than the original font, and this single font now claims that it knows how to render the entire string. Why? It appears that Java is giving us back some sort of "frankenfont" that knows how to render all kinds of glyphs, even though it only understands the metrics for the glyphs that are supplied in the underlying source font.

A more complete example showing the JFreeChart rendering example is here, based off one of the JFreeChart examples: https://gist.github.com/sdudley/b710fd384e495e7f1439 The output from this example is shown below.

Example with the Source Sans Pro font (laid out incorrectly):

enter image description here

Example with the IPA Japanese font (laid out correctly):

enter image description here

Chorion answered 26/9, 2014 at 16:8 Comment(1)
I haven't figured this out yet, but some of my sleuthing related to the "magic manipulation" yielded my answer to this other SO question. Regrettably, using that JFreeChart workaround deliberately forces TextLayout to use a single font and it doesn't even try to look for runs of non-displayable characters that need to be rendered in a different font.Chorion
C
5

I finally figured it out. There were a number of underlying causes, which was further hindered by an added dose of cross-platform variability.

JFreeChart Renders Text in the Wrong Location Because It Uses a Different Font Object

The layout problem occurred because JFreeChart was inadvertently calculating the metrics for the layout using a different Font object than the one AWT actually uses to render the font. (For reference, JFreeChart's calculation happens in org.jfree.text#getTextBounds.)

The reason for the different Font object is a result of the implicit "magic manipulation" mentioned in the question, which is performed inside of java.awt.font.TextLayout#singleFont.

Those three lines of magic manipulation can be condensed to just this:

font = Font.getFont(font.getAttributes())

In English, this asks the font manager to give us a new Font object based on the "attributes" (name, family, point size, etc) of the supplied font. Under certain circumstances, the Font it gives back to you will be different from the Font you originally started with.

To correct the metrics (and thus fix the layout), the fix is to run the one-liner above on your own Font object before setting the font in JFreeChart objects.

After doing this, the layout worked fine for me, as did the Japanese characters. It should fix the layout for you too, although it may not show the Japanese characters correctly for you. Read below about native fonts to understand why.

The Mac OS X Font Manager Prefers to Return Native Fonts Even If You Feed it a Physical TTF File

The layout of the text was fixed by the above change...but why does this happen? Under what circumstances would the FontManager actually give us back a different type of Font object than the one we provided?

There are many reasons, but at least on Mac OS X, the reason related to the problem is that the font manager seems to prefer to return native fonts whenever possible.

In other words, if you create a new font from a physical TTF font named "Foobar" using Font.createFont, and then call Font.getFont() with attributes derived from your "Foobar" physical font...so long as OS X already has a Foobar font installed, the font manager will give you back a CFont object rather than the TrueTypeFont object you were expecting. This seems to hold true even if you register the font through GraphicsEnvironment.getLocalGraphicsEnvironment().registerFont.

In my case, this threw a red herring into the investigation: I already had the "Source Sans" font installed on my Mac, which meant that I was getting different results from people who did not.

Mac OS X Native Fonts Always Support Asian Characters

The crux of the matter is that Mac OS X CFont objects always support Asian character sets. I am unclear of the exact mechanism that allows this, but I suspect that it's some sort of fallback font feature of OS X itself and not Java. In either case, a CFont always claims to (and is truly able to) render Asian characters with the correct glyphs.

This makes clear the mechanism that allowed the original problem to occur:

  • we created a physical Font from a physical TTF file, which does not itself support Japanese.
  • the same physical font as above was also installed in my Mac OS X Font Book
  • when calculating the layout of the chart, JFreeChart asked the physical Font object for the metrics of the Japanese text. The physical Font could not do this correctly because it does not support Asian character sets.
  • when actually drawing the chart, the magic manipulation in TextLayout#singleFont caused it to obtain a CFont object and draw the glyph using the same-named native font, versus the physical TrueTypeFont. Thus, the glyphs were correct but they were not positioned properly.

You Will Get Different Results Depending on Whether You Registered the Font and Whether You Have The Font Installed in Your OS

If you call Font.getFont() with the attributes from a created TTF font, you will get one of three different results, depending on whether the font is registered and whether you have the same font installed natively:

  • If you do have a native platform font installed with the same name as your TTF font (regardless of whether you registered the font or not), you will get an Asian-supporting CFont for the font you wanted.
  • If you registered the TTF Font in the GraphicsEnvironment but you do not have a native font of the same name, calling Font.getFont() will return a physical TrueTypeFont object back. This gives you the font you want, but you don't get Asian characters.
  • If you did not register the TTF Font and you also do not have a native font of the same name, calling Font.getFont() return an Asian-supporting CFont, but it will not be the font you requested.

In hindsight, none of this is entirely surprising. Leading to:

I Was Inadvertently Using the Wrong Font

In the production app, I was creating a font, but I forgot to initially register it with the GraphicsEnvironment. If you haven't registered a font when you perform the magic manipulation above, Font.getFont() doesn't know how to retrieve it and you get a backup font instead. Oops.

On Windows, Mac and Linux, this backup font generally seems to be Dialog, which is a logical (composite) font that supports Asian characters. At least in Java 7u72, the Dialog font defaults to the following fonts for Western alphabets:

  • Mac: Lucida Grande
  • Linux (CentOS): Lucida Sans
  • Windows: Arial

This mistake was actually a good thing for our Asian users, because it meant that their character sets rendered as expected with the logical font...although the Western users were not getting the character sets that we wanted.

Since it had been rendering in the wrong fonts and we needed to fix the Japanese layout anyway, I decided that I would be better off trying to standardize on one single common font for future releases (and thus coming closer to trashgod's suggestions).

Additionally, the app has font rendering quality requirements that may not always permit the use of certain fonts, so a reasonable decision seemed to be to try to configure the app to use Lucida Sans, which is the one physical font that is included by Oracle in all copies of Java. But...

Lucida Sans Doesn't Play Well with Asian Characters on All Platforms

The decision to try using Lucida Sans seemed reasonable...but I quickly found out that there are platform differences in how Lucida Sans is handled. On Linux and Windows, if you ask for a copy of the "Lucida Sans" font, you get a physical TrueTypeFont object. But that font doesn't support Asian characters.

The same problem holds true on Mac OS X if you request "Lucida Sans"...but if you ask for the slightly different name "LucidaSans" (note the lack of space), then you get a CFont object that supports Lucida Sans as well as Asian characters, so you can have your cake and eat it too.

On other platforms, requesting "LucidaSans" yields a copy of the standard Dialog font because there is no such font and Java is returning its default. On Linux, you are somewhat lucky here because Dialog actually defaults to Lucida Sans for Western text (and it also uses a decent fallback font for Asian characters).

This gives us a path to get (almost) the same physical font on all platforms, and which also supports Asian characters, by requesting fonts with these names:

  • Mac OS X: "LucidaSans" (yielding Lucida Sans + Asian backup fonts)
  • Linux: "Dialog" (yielding Lucida Sans + Asian backup fonts)
  • Windows: "Dialog" (yielding Arial + Asian backup fonts)

I've pored over the fonts.properties on Windows and I could not find a font sequence that defaulted to Lucida Sans, so it looks like our Windows users will need to get stuck with Arial...but at least it's not that visually different from Lucida Sans, and the Windows font rendering quality is reasonable.

Where Did Everything End Up?

In sum, we're now pretty much just using platform fonts. (I am sure that @trashgod is having a good chuckle right now!) Both Mac and Linux servers get Lucida Sans, Windows gets Arial, the rendering quality is good, and everyone is happy!

Chorion answered 21/11, 2014 at 22:27 Comment(1)
I'm always happy to see what proves to get good cross-platform results!Subscript
S
3

Although it doesn't address your question directly, I thought it might provide a useful point of reference to show the result using the platform's default font in an unadorned chart. A simplified version of BarChartDemo1, source, is shown below.

Due to the vagaries of third-party font metrics, I try to avoid deviating from the platform's standard logical fonts, which are chosen based on the platform's supported locale's. Logical fonts are mapped to physical font's in the platform's configuration files. On Mac OS, the relevant file are in $JAVA_HOME/jre/lib/, where $JAVA_HOME is result of evaluating /usr/libexec/java_home -v 1.n and n is your version. I see similar results with either version 7 or 8. In particular, fontconfig.properties.src defines the font used to supply Japanese font family variations. All mappings appear to use MS Mincho or MS Gothic.

image

import java.awt.Dimension;
import java.awt.EventQueue;
import org.jfree.chart.ChartFactory;
import org.jfree.chart.ChartPanel;
import org.jfree.chart.JFreeChart;
import org.jfree.chart.plot.PlotOrientation;
import org.jfree.data.category.CategoryDataset;
import org.jfree.data.category.DefaultCategoryDataset;
import org.jfree.ui.ApplicationFrame;
import org.jfree.ui.RefineryUtilities;

/**
 * @see https://mcmap.net/q/1501688/-why-does-a-latin-characters-only-java-font-claim-to-support-asian-characters-even-though-it-does-not
 * @see http://www.jfree.org/jfreechart/api/javadoc/src-html/org/jfree/chart/demo/BarChartDemo1.html
 */
public class BarChartDemo1 extends ApplicationFrame {

    /**
     * Creates a new demo instance.
     *
     * @param title the frame title.
     */
    public BarChartDemo1(String title) {
        super(title);
        CategoryDataset dataset = createDataset();
        JFreeChart chart = createChart(dataset);
        ChartPanel chartPanel = new ChartPanel(chart){

            @Override
            public Dimension getPreferredSize() {
                return new Dimension(600, 400);
            }
        };
        chartPanel.setFillZoomRectangle(true);
        chartPanel.setMouseWheelEnabled(true);
        setContentPane(chartPanel);
    }

    /**
     * Returns a sample dataset.
     *
     * @return The dataset.
     */
    private static CategoryDataset createDataset() {

        // row keys...
        String series1 = "First";
        String series2 = "Second";
        String series3 = "Third";

        // column keys...
        String category1 = "クローズ";
        String category2 = "クローズ";
        String category3 = "クローズクローズクローズ";
        String category4 = "Category 4 クローズ";
        String category5 = "Category 5";

        // create the dataset...
        DefaultCategoryDataset dataset = new DefaultCategoryDataset();

        dataset.addValue(1.0, series1, category1);
        dataset.addValue(4.0, series1, category2);
        dataset.addValue(3.0, series1, category3);
        dataset.addValue(5.0, series1, category4);
        dataset.addValue(5.0, series1, category5);

        dataset.addValue(5.0, series2, category1);
        dataset.addValue(7.0, series2, category2);
        dataset.addValue(6.0, series2, category3);
        dataset.addValue(8.0, series2, category4);
        dataset.addValue(4.0, series2, category5);

        dataset.addValue(4.0, series3, category1);
        dataset.addValue(3.0, series3, category2);
        dataset.addValue(2.0, series3, category3);
        dataset.addValue(3.0, series3, category4);
        dataset.addValue(6.0, series3, category5);

        return dataset;

    }

    /**
     * Creates a sample chart.
     *
     * @param dataset the dataset.
     *
     * @return The chart.
     */
    private static JFreeChart createChart(CategoryDataset dataset) {

        // create the chart...
        JFreeChart chart = ChartFactory.createBarChart(
                "Bar Chart Demo 1", // chart title
                "Category", // domain axis label
                "Value", // range axis label
                dataset, // data
                PlotOrientation.HORIZONTAL, // orientation
                true, // include legend
                true, // tooltips?
                false // URLs?
        );
        return chart;
    }

    /**
     * Starting point for the demonstration application.
     *
     * @param args ignored.
     */
    public static void main(String[] args) {
        EventQueue.invokeLater(() -> {
            BarChartDemo1 demo = new BarChartDemo1("Bar Chart Demo 1");
            demo.pack();
            RefineryUtilities.centerFrameOnScreen(demo);
            demo.setVisible(true);
        });
    }
}
Subscript answered 29/9, 2014 at 0:9 Comment(8)
Thanks for the references! I would love it if I were able to use the standard logical fonts. Unfortunately, the system fonts on certain client platforms (particularly CentOS) are ugly and don't make the cut for presentation-quality charts, at least inasmuch as Latin text is concerned. I am less concerned about the quality of the Japanese characters: if I could get it to render Latin text with a nice looking font, but fall back to the platform font for Japanese or other non-supported characters, that would be fine. I suspect that's what it's doing anyway in my example, +/- the metrics problem.Chorion
I see that the StandardChartTheme defaults to "Tahoma"; you might try Font.SANS_SERIF.Subscript
On top of the "not ugly" bit, I guess I should have also added "consistency", so platform fonts aren't going to cut it. I don't want the charts to potentially turn all janky just because the IT department switched from Windows to Linux, for example. I even tried using Lucida Sans but still found the results to be suboptimal. (I guess I should add "typeface snob" to my profile? I am also tempted to upvote your previous comment just because you used a semicolon.)Chorion
I would urge you to embrace platform UI coherence over inter-platform consistency. Let the user choose the Look & Feel and persist that choice.Subscript
I would agree with you for end user programs, but this is a server app and the output is rendered in the user's browser and/or exported to PDF (and then frequently emailed to less tech-savvy individuals). I am pretty sure that (say) modern OS X Mavericks users will not embrace the crappy platform fonts that come with some ancient Linux distro that IT just happens to have installed the software on. :-)Chorion
But yes, we will probably have to find a way to allow the user to switch to a different font manually if we can't find any other solution, but I am holding out hope for a zero-config solution.Chorion
I am happy to send the bounty to @Subscript today as this is the end of the bounty period and the answer certainly gives some useful insight. For anyone else who is reading this, I am still on the lookout for direct answers to the original questions!Chorion
Moreover, you can answer your own question as your project evolves.Subscript

© 2022 - 2024 — McMap. All rights reserved.