Where can I find a list of IDs or rules for the PHP transliterator (Intl)?

Transliterator::listIDs() will list IDs, but apparently it's not a complete list.

In the example from this page, the ID looks like:

Any-Latin; NFD; [:Nonspacing Mark:] Remove; NFC; [:Punctuation:] Remove; Lower();

which is kind of weird, because IDs are supposed to be unique. This looks more like a rule, but it doesn't work if I pass it to the createFromRules method :)

Anyway, I'm trying to remove all punctuation from a string, except the dash (-) or the characters in a specific list.

Do you know if that's possible? Or is there documentation that explains the transliterator syntax better?

Hiroshima answered 19/5, 2013 at 14:41 Comment(0)

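The IDs that Transliterator::listIDs() returns are the "basic IDs". The example you gave is a "compound ID": several IDs (each optionally prefixed with a filter such as [:Punctuation:]) chained together with semicolons. You can see the ICU docs on this.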
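
To illustrate the difference, here is a minimal sketch (assuming the intl extension is loaded; the variable names are only for illustration). It checks that a compound ID does not appear in the listIDs() output, yet create() still accepts it:

```php
<?php
// A compound ID: three steps chained with semicolons.
// The second step carries a filter, [:Nonspacing Mark:], in front of the basic ID "Remove".
$compoundId = 'NFD; [:Nonspacing Mark:] Remove; NFC';

// listIDs() only reports the registered basic IDs, so the compound string is not in it.
$basicIds = Transliterator::listIDs();
$isListed = in_array($compoundId, $basicIds, true);

// create() parses the compound ID and builds the chained transliterator.
$transliterator = Transliterator::create($compoundId);
$output = $transliterator->transliterate('Crème brûlée');

echo $isListed ? "listed\n" : "not listed\n";
echo $output, "\n";
```

So a string does not have to be in the list to be a valid argument for create(); it only has to be built from listed IDs and valid filters.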

You can also create your own rules with Transliterator::createFromRules().
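
A compound ID is not valid rule syntax as it stands, which is probably why passing it to createFromRules() failed. As far as I understand the ICU rule syntax, each step has to be written as a transform rule of the form ":: ID ;". A rough sketch under that assumption (variable names are illustrative):

```php
<?php
// The compound ID from the question, rewritten as ICU transform rules:
// every step is prefixed with "::" and terminated with ";".
$rules = ':: NFD; :: [:Nonspacing Mark:] Remove; :: NFC; :: [:Punctuation:] Remove; :: Lower;';

$transliterator = Transliterator::createFromRules($rules, Transliterator::FORWARD);
if ($transliterator === null) {
    // createFromRules() returns null when the rules cannot be parsed.
    throw new RuntimeException(intl_get_error_message());
}

// Strips accents and punctuation, then lowercases.
$output = $transliterator->transliterate('Crème Brûlée!');
echo $output, "\n";
```

For a plain chain of existing IDs like this one, create() with the compound ID is simpler; createFromRules() becomes useful once you need your own conversion rules.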

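You can take a look at the predefined rules:

<?php
// Open ICU's bundled transliteration data; the bundle name depends on the ICU major version.
$a = new ResourceBundle(null, sprintf('icudt%dl-translit', INTL_ICU_VERSION), true);

foreach ($a['RuleBasedTransliteratorIDs'] as $name => $v) {
    // Entries backed by a rule resource have a 'file' key; the rest only have 'internal'.
    $file = @$v['file'];
    if (!$file) {
        $file = $v['internal'];
        echo $name, " (direction: $file[direction]; internal)\n";
    } else {
        echo $name, " (direction: $file[direction])\n";
        // Print the rule source for this transliterator.
        echo $file['resource'];
    }
    echo "\n--------------\n";
}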

After formatting, the result looks like this.

Armoured answered 9/6, 2013 at 23:8 Comment(1)
Friendly reminder: that's a pretty heavy .txt file for a machine that is low on memory; Chrome and Sublime Text may stop responding while handling it. - Coltson

Just in case someone wants a working example: the example mentioned (from the PHP manual) uses the procedural style. To make it work in the object-oriented style, use create() instead of createFromRules(), because create() takes an ID while createFromRules() expects rule syntax.

function removePunctuation($string) {
    $transliterator = Transliterator::create("Any-Latin; NFD; [:Nonspacing Mark:] Remove; NFC; [:Punctuation:] Remove;", Transliterator::FORWARD);

    return $transliterator->transliterate($string);
}
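
For the original question (remove punctuation except the dash or a given list of characters), one possible approach is to subtract the characters to keep from the filter set, using ICU's set-difference syntax [[:Punctuation:]-[...]]. This is only a sketch: keepSetPattern() and removePunctuationExcept() are hypothetical helper names, not part of intl, and the characters are escaped as \UXXXXXXXX on the assumption that this is the safest way to put arbitrary characters into a set.

```php
<?php
// Hypothetical helper: turn a string of characters into an ICU set such as [\U0000002D].
// Escaping every character avoids clashes with set syntax characters like "-" or "]".
function keepSetPattern(string $keep): string
{
    $escaped = '';
    foreach (preg_split('//u', $keep, -1, PREG_SPLIT_NO_EMPTY) as $char) {
        $escaped .= sprintf('\\U%08X', IntlChar::ord($char));
    }
    return '[' . $escaped . ']';
}

// Hypothetical helper: remove all punctuation except the characters listed in $keep.
function removePunctuationExcept(string $text, string $keep = '-'): string
{
    // Filter = all punctuation minus the characters to keep; "Remove" deletes what matches.
    $id = '[[:Punctuation:]-' . keepSetPattern($keep) . '] Remove';

    $transliterator = Transliterator::create($id);
    if ($transliterator === null) {
        throw new RuntimeException(intl_get_error_message());
    }

    $result = $transliterator->transliterate($text);
    if ($result === false) {
        throw new RuntimeException($transliterator->getErrorMessage());
    }
    return $result;
}

echo removePunctuationExcept('Hello, world - it works!'), "\n";
echo removePunctuationExcept('a_b.c', '_'), "\n";
```

The same filter can be dropped into the compound ID above in place of [:Punctuation:] if you also want the accent stripping.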
Faustinafaustine answered 14/6, 2020 at 13:0 Comment(0)
