How do I deal with characters unsupported by the font file when using imagettftext()?

I use Verdana font inside my images created by PHP GD library.

imagettftext($image, $fontSize, 0, 70, $y, $color, $font, $username );

In most cases imagettftext() works very well for strings.
But some of my users use unusual characters/symbols in their names,
and those are not rendered correctly when I print the names onto images. For example:
[image]

This user's name contains the symbols ɦɪɲɣƙƨєʌɾ, which Verdana can't print.

I used this:

$username=iconv('UTF-8', 'ASCII//TRANSLIT', $username);

The output is this:
[image]

(The current locale switches between English and German, so maybe the current locale can't handle these characters: ɦɪɲɣƙƨєʌɾ)

It seems it's not possible to transliterate ɦ to h or ɲ to n without writing a very big str_replace() block, like this.
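
For illustration, a minimal sketch of such a hand-written mapping. The function name replaceLookalikes() and most of the replacements are hypothetical: only ɦ to h and ɲ to n come from the text above, the rest are look-alike guesses. It assumes the source file is saved as UTF-8.

```php
<?php
// Hypothetical hand-written look-alike map; it has to be extended by hand
// every time a user shows up with a new symbol.
function replaceLookalikes(string $text): string
{
    static $map = [
        'ɦ' => 'h',
        'ɲ' => 'n',
        'ɪ' => 'i', // guess
        'ƙ' => 'k', // guess
        'ɾ' => 'r', // guess
    ];
    // strtr() with an array replaces whole keys, so multi-byte UTF-8 keys are safe.
    return strtr($text, $map);
}
```

Characters that are not in the map pass through unchanged, which is exactly why this approach grows into a very big block.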

  • So I wonder whether it is possible to check if the font (Verdana) can show these symbols. If one of the characters in the string can't be shown, I could pass an empty string to imagettftext() instead. Can I check which characters the font supports? Or create a character map of the symbols Verdana supports, and check whether my string includes unsupported symbols?
    (I think it is not possible, due to this question)

  • Or, as another solution, is it possible to use multiple fonts in imagettftext()?
    For example, first try Verdana; if Verdana doesn't cover those symbols, use Arial, sans-serif, etc.

  • Or is there any other solution?

Edit:
It seems like Verdana doesn't support these Unicode characters in my text.
Characters Verdana supports: http://www.fileformat.info/info/unicode/font/verdana/grid.htm
Characters Verdana does not support: http://www.fileformat.info/info/unicode/font/verdana/missing.htm

Brusa answered 8/3, 2014 at 16:2 Comment(3)
I edited the title and tags, feel free to change if you don't like it. - Tussah
Are you 100% sure the incoming data is UTF-8? Because if it isn't, there's no way iconv() can transliterate the characters for you. You have to make sure you specify the correct encoding in the first parameter. (That said, I'm not sure whether ɦ should transliterate to h in the first place, so maybe this is the expected result.) - Tussah
I'm sure about UTF-8. Even without iconv, it would be good to know whether Verdana can print it, so I can print a dummy word if the font is not suitable. - Brusa

My first choice would be switching to a font that supports the full range of characters that you want to be able to handle. But do not expect a single font to ever implement the million-or-so possible characters in Unicode.

Now, if you want to take the (lazy ;) transliteration route, I will refer to this answer from Kemal Dağ.

I don't have PHP 5.4 on hand right now, so I cannot comment on the Transliterator class, but Kemal Dağ's port of JTransliteration performs pretty well:

<?php
    require 'transliteration/JTransliteration.php';

    $input = 'ɦɪɲɣƙƨєʌɾ';
    echo JTransliteration::transliterate($input); // output: hIngk2ie^r

    $input = 'Хეλлఒ Wओრলद';
    echo JTransliteration::transliterate($input);
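
For comparison, a minimal sketch of the Transliterator route mentioned above, assuming the intl extension is loaded. The wrapper name transliterateName() is hypothetical, and 'Any-Latin; Latin-ASCII' is an ICU transform ID picked for illustration. The result may still contain non-ASCII characters where ICU has no mapping, so check it before relying on it.

```php
<?php
// Assumes the intl extension; falls back to returning the input unchanged without it.
function transliterateName(string $name): string
{
    if (!function_exists('transliterator_transliterate')) {
        return $name;
    }
    // First convert any script to Latin, then strip the result down towards ASCII.
    $result = transliterator_transliterate('Any-Latin; Latin-ASCII', $name);
    return $result === false ? $name : $result;
}

echo transliterateName('ɦɪɲɣƙƨєʌɾ'), "\n";
```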

Finally, if you want to check whether a given font supports a given character, it gets a bit more hairy. This library (php-font-lib) will help a lot. It requires PHP >= 5.3 (it uses namespaces):

<?php
    $fontFile = 'arial.ttf';
    $charToCheck = 'ɣ';

    require_once 'php-font-lib-master/src/FontLib/Autoloader.php';

    use FontLib\Font;
    use FontLib\TrueType\Collection;


    $font = Font::load($fontFile);
    if ($font instanceof Collection) {
        $font = $font->getFont(0);
    }
    $subtable = null;
    foreach($font->getData("cmap", "subtables") as $_subtable) {
        if ($_subtable["platformID"] == 3 && $_subtable["platformSpecificID"] == 1) {
            $subtable = $_subtable;
            break;
        }
    }

    if (isset($subtable["glyphIndexArray"][ord_utf8($charToCheck)])) {
        $supported = 'Supported';
    } else {
        $supported = 'Not Supported';
    }

    echo "$charToCheck is $supported by font $fontFile";


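    // Decode the first UTF-8 character of $c into its Unicode code point.
    function ord_utf8($c) {
        $b0 = ord($c[0]);
        if ( $b0 < 0x80 ) {
            // single byte (ASCII)
            return $b0;
        }
        $b1 = ord($c[1]);
        if ( $b0 < 0xE0 ) {
            // two-byte sequence
            return (($b0 & 0x1F) << 6) + ($b1 & 0x3F);
        }
        if ( $b0 < 0xF0 ) {
            // three-byte sequence
            return (($b0 & 0x0F) << 12) + (($b1 & 0x3F) << 6) + (ord($c[2]) & 0x3F);
        }
        // four-byte sequence (code points above U+FFFF)
        return (($b0 & 0x07) << 18) + (($b1 & 0x3F) << 12) + ((ord($c[2]) & 0x3F) << 6) + (ord($c[3]) & 0x3F);
    }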

(Code shamelessly pillaged from font_info.php and R. Hill's ord_utf8().)
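
Building on that check, a composite "primary font plus fallback font" approach could start by splitting the name into runs. This is a rough sketch, not a tested renderer: splitByCoverage() is a hypothetical helper, and $supported is assumed to be an array keyed by Unicode code point, such as the "glyphIndexArray" read from the cmap subtable above. It needs PHP 7.4+ for mb_str_split().

```php
<?php
/**
 * Split $text into runs of consecutive characters that are (or are not)
 * covered by the primary font.
 *
 * $supported is assumed to be keyed by Unicode code point.
 */
function splitByCoverage(string $text, array $supported): array
{
    $runs = [];
    foreach (mb_str_split($text, 1, 'UTF-8') as $char) {
        $covered = isset($supported[mb_ord($char, 'UTF-8')]);
        $last = count($runs) - 1;
        if ($last >= 0 && $runs[$last]['covered'] === $covered) {
            // Same coverage as the previous character: extend the current run.
            $runs[$last]['text'] .= $char;
        } else {
            // Coverage changed (or first character): start a new run.
            $runs[] = ['covered' => $covered, 'text' => $char];
        }
    }
    return $runs;
}
```

Each run could then be drawn with its own imagettftext() call, using the primary font for covered runs and a fallback font for the others, and advancing the x position by the width that imagettfbbox() reports for the run just drawn.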

P.S. The string "ɦɪɲɣƙƨєʌɾ" is made mostly of characters from the International Phonetic Alphabet (ƙ and ƨ come from the Latin Extended-B block, and є is a Cyrillic letter). I am not sure any locale supports transliterating these characters, since most of them are not part of the everyday alphabet of widely used languages.

Mauricemauricio answered 11/3, 2014 at 19:50 Comment(0)

As long as you're using UTF-8, there is no reason for a TrueType font with Unicode coverage not to show those letters (with a caveat for East Asian characters!).

Here is my simple example, with a TrueType font:

// utf-8 text
$text   = 'ɦɪɲɣƙƨєʌɾ';

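// If the text is read from a file (for example) and is encoded as
// ISO-8859-1 (common in western countries), you can simply convert
// it to UTF-8 using utf8_encode() (deprecated as of PHP 8.2;
// mb_convert_encoding($text, 'UTF-8', 'ISO-8859-1') is the alternative):

//$text = utf8_encode($text);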

$png    = imagecreatefrompng('/tmp/sample.png');
$color  = imagecolorallocate($png, 0, 0, 0);

// TrueType font that covers these Unicode characters
$font   = '/usr/share/fonts/truetype/dejavu/DejaVuSans.ttf';

imagettftext($png, 50, 0, 50, 50, $color, $font, $text);
imagepng($png, '/tmp/test.png');

And the result:

[image]

Firelock answered 17/3, 2014 at 12:10 Comment(2)
I also checked DejaVu. It seems like DejaVu supports these weird characters, and most of the users' names can be shown with DejaVu. But I don't want to lose Verdana... When I check the string I always see that it is UTF-8 encoded. I'm thinking of a composite solution that will use Verdana for safe characters and DejaVu for the characters Verdana lacks. - Brusa
You'll probably have to choose: either keep using a font that does not support the full Unicode range and add some complexity and limitations to your code, or find a font that keeps your code simple. Here you'll find more information about Unicode and fonts: unicode.org/faq/basic_q.html Here you'll find a list of fonts and resources for Unicode: unicode.org/resources/fonts.html And the following links show which Unicode characters are and are not in Verdana: fileformat.info/info/unicode/font/verdana/list.htm fileformat.info/info/unicode/font/verdana/missing.htm - Firelock

Have you set the correct locale? It can be necessary for iconv: https://www.php.net/manual/en/function.iconv.php#74101

Lynn answered 9/3, 2014 at 13:40 Comment(1)
The current locale is English or German. I can't predict which locale ɦɪɲɣƙƨєʌɾ exists in, so I think it is not possible to use iconv in this situation. - Brusa

The problem you describe can fail in multiple places, and you need to know about them first in order to make the right decision on how best to solve it.

Because so many things can go wrong, you need to fail early in case the input is not as expected. So first of all, you must validate that the string is in the right encoding to be used with imagettftext() before calling that function:

if (!preg_match('//u', $username)) {
    throw new Exception(
        sprintf(
            "Username string %s can not be used with imagettftext()"
            , var_export($username, true)
        )
    );
}

Without this check you cannot rely on getting the right results in the first place. If the check fails, the fix is to make sure the string really is UTF-8 encoded. This is more or less a sanity check: you say the string is already UTF-8 encoded, so it should pass. However, in case you made a mistake with the encoding and the string is not valid (which can happen easily), this check keeps you from looking in the wrong place.

As the output in your question already shows, you most likely made a mistake with the encoding: otherwise the supported characters, at least, would be displayed correctly. Instead, not only are some characters left out, but entirely different characters are shown. That is a clear sign of a wrong encoding:

[image]

So do not skip this step to actually verify the required encoding of the string.

This is especially important for the next thing:

Next, you need to ensure that the font face supports the letters in that string. The Verdana font supports 794 Unicode characters (full list). If the characters you're looking for are not among them, imagettftext() cannot display them because the font lacks them. In that case you need to choose a font that supports the Unicode characters you're looking for instead. An overview table with different fonts is available at Wikipedia:

More guidance on the right font selection can be found here on Stackoverflow:

If you use the right encoding in the string variable and a font having a glyph for all Unicode characters encoded in that string, imagettftext does cover your needs.

As I wrote at the beginning, there are many places where this can go wrong. If you pass the encoding check and the font already supports all the characters, there is one more place to fail: the string is UTF-8 encoded, but it does not contain the characters you think it does.
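
One way to see what the string really contains is to list its code points and compare them with the font's coverage list. A minimal sketch, assuming PHP 7.4+ with mbstring and a UTF-8 source file; the helper name codePoints() is hypothetical:

```php
<?php
// List the Unicode code points of a UTF-8 string in U+XXXX notation.
function codePoints(string $text): array
{
    $points = [];
    foreach (mb_str_split($text, 1, 'UTF-8') as $char) {
        $points[] = sprintf('U+%04X', mb_ord($char, 'UTF-8'));
    }
    return $points;
}

echo implode(' ', codePoints('ɦɪɲ')), "\n"; // U+0266 U+026A U+0272
```

If the values printed for a username differ from what you expect, the problem sits in the data rather than in the font.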

Washtub answered 16/3, 2014 at 13:51 Comment(1)
I rechecked with mb_detect_encoding and preg_match, and saw that all the weird names are UTF-8 encoded. It seems Verdana doesn't support them, but DejaVu does support these weird characters (en.wikipedia.org/wiki/DejaVu_fonts). What I can't figure out is whether Verdana will be able to display any new text that comes in. - Brusa
