Flying Saucer: Chinese character rendered as a box in PDF

When I use Flying Saucer to convert an HTML page containing Chinese characters to PDF, each Chinese character is rendered as an empty box.

I have tried two approaches: the CSS from the answer to "Flying Saucer font for unicode characters" and the code from the answer to "Flying Saucer iTextPDF Chinese Fonts". Neither worked. Does anyone have another suggestion?

I have declared the UTF-8 charset in a meta tag:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html language="en">
<head>
<meta content="text/html; charset=UTF-8" http-equiv="content-type">   </meta>
<link rel="stylesheet" type="text/css" href="file:///opt/template/employer.css"/>
<link rel="stylesheet" type="text/css" href="file:///opt/template/style.css"/>
<link rel="stylesheet" type="text/css" media="print" href="file:///opt/template/print.css"/>
</head>

Here is the relevant section with the Chinese characters:

<tbody>
    <tr>
        <td align="left" width="150" valign="top">
            Name
        </td>
        <td align="left" width="305" valign="top">
            <label id="candidateName">VU DINH THE / 你好</label>
        </td>
    </tr>
    <tr>
        <td align="left" width="150" valign="top">
            Gender/Status
        </td>
        <td align="left" width="305" valign="top">
            <label id="gender">Female</label> / <label id="status">Single</label>
        </td>
    </tr>
    <tr>
        <td align="left" width="150" valign="top">
            Date of Birth/Age
        </td>
        <td align="left" width="305" valign="top">
            <label id="dob">12 Sep 1985</label> / <label id="age">30</label>
        </td>
    </tr>

And the content of print.css:

@font-face {
    font-family: Arial Unicode MS;
    src: url('file:///opt/template/arialuni.ttf');
    -fs-pdf-font-embed: embed;
    -fs-pdf-font-encoding: Identity-H;
}
Rattlebrain answered 14/8, 2015 at 6:53 Comment(5)
Can you add the HTML code you're trying to transform, including the Chinese character that doesn't render? - Frenetic
@obourgain, I have added the relevant HTML parts and the CSS. - Rattlebrain
The code seems correct; it works fine on my PC. The problem may come from the arialuni.ttf file. What is the size of the file? - Frenetic
@obourgain, it is 1588364 bytes. - Rattlebrain
@obourgain, you are right. I replaced the file downloaded from code.google.com/p/ipwn/downloads/detail?name=arialuni.ttf with the ARIALUNI.TTF from rmhoist.com/downloads/font and it works now. Thank you very much. Please post this as an answer so I can accept it. - Rattlebrain

A character rendered as an empty square or rectangle usually means the font file contains no glyph for that character, so the renderer has nothing to draw.

In this case, the HTML and CSS are correct, but the arialuni.ttf file is incomplete.

For reference, a complete arialuni.ttf should be about 23 MB.
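
One way to test a font file before blaming the HTML or CSS is to load it with the JDK and ask whether it has glyphs for the text in question. The sketch below is only an approximation: it uses `java.awt.Font`, not the font loader that Flying Saucer relies on, so a pass here does not guarantee the PDF will render. The class name `FontCoverageCheck` and its helper methods are made up for this example. The default path is the one from the question's print.css.

```java
import java.awt.Font;
import java.awt.FontFormatException;
import java.io.File;
import java.io.IOException;

public class FontCoverageCheck {

    /** Turns the index returned by Font.canDisplayUpTo into a readable message. */
    static String describe(String text, int firstMissing) {
        if (firstMissing == -1) {
            // canDisplayUpTo returns -1 when every character has a glyph.
            return "all characters covered";
        }
        int codePoint = text.codePointAt(firstMissing);
        return String.format("no glyph for U+%04X at index %d", codePoint, firstMissing);
    }

    /** Loads a TrueType file and reports whether it has glyphs for the given text. */
    static String check(File ttf, String text) throws IOException, FontFormatException {
        Font font = Font.createFont(Font.TRUETYPE_FONT, ttf);
        return describe(text, font.canDisplayUpTo(text));
    }

    public static void main(String[] args) throws Exception {
        // Default path comes from the question's print.css; pass another path as args[0].
        File ttf = new File(args.length > 0 ? args[0] : "/opt/template/arialuni.ttf");
        System.out.println("size: " + ttf.length() + " bytes");
        // The two Chinese characters from the question, written as Unicode escapes
        // so the source compiles regardless of file encoding.
        System.out.println(check(ttf, "\u4F60\u597D"));
    }
}
```

If the second line reports a missing glyph, or the file is far smaller than expected, replacing the font file is the first thing to try.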

Frenetic answered 19/8, 2015 at 11:6 Comment(0)
