Sort an array with special characters in PHP

I have an array that holds the names of languages in Spanish:

$lang["ko"] = "coreano"; //korean
$lang["ar"] = "árabe"; //arabic
$lang["es"] = "español"; //spanish
$lang["fr"] = "francés"; //french

I need to sort the array and maintain index association, so I use asort() with the SORT_LOCALE_STRING flag:

setlocale(LC_ALL,'es_ES.UTF-8'); //this is at the beginning (config file)
asort($lang,SORT_LOCALE_STRING);
print_r($lang);

The expected output would be in this order:

  • Array ( [ar] => árabe [ko] => coreano [es] => español [fr] => francés )

However, this is what I'm receiving:

  • Array ( [ko] => coreano [es] => español [fr] => francés [ar] => árabe )

Am I missing something? Thanks for your feedback! (My server is running PHP 5.2.13.)

Shoe answered 18/5, 2012 at 8:48 Comment(6)
Wild guess: possibly because c comes before á? - Pebble
That's why I'm using SORT_LOCALE_STRING. 'á' should come after 'a' and before 'c'. - Shoe
Did you check the return value of setlocale? Most probably it simply failed. - Bolger
Yep, it's fine. It works with all other locale functions like strftime(). - Shoe
@andufo: What does "it's fine" mean? Also, what OS are you on? - Bolger
Possible duplicate of "sort array with special characters in php" - Berserker

Try sorting by transliterated names:

function compareASCII($a, $b) {
    $at = iconv('UTF-8', 'ASCII//TRANSLIT', $a);
    $bt = iconv('UTF-8', 'ASCII//TRANSLIT', $b);
    return strcmp($at, $bt);
}

uasort($lang, 'compareASCII');

print_r($lang);
Reify answered 18/5, 2012 at 8:54 Comment(5)
This may work for this specific case, but it's not a robust general solution; what happens if you want to sort an array containing, for example, strings of Cyrillic or Greek letters? ASCII transliteration isn't particularly reliable. - Valetudinarian
@WillVousden You are right. Anyway, for an array containing names of languages it's OK, I think. - Reify
@lorenzo-s: Will is right, and the contents of the array don't come into it (what if it were names of languages in Greek?). This solution may be creative, but it's fundamentally flawed on a technical level. It would be massively better to just troubleshoot the problem, since the original code works for other people. - Bolger
@lorenzo-s: It probably is OK, but then if it's later decided that the names of the languages should be in the languages themselves (and respective alphabets), e.g. Korean: 한국의, then there might be a problem :) - Valetudinarian
@WillVousden Good point. In fact that same code should work for Chinese and Hebrew characters as well. - Shoe
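
For the robustness concern raised in the comments above, one alternative that does not rely on ASCII transliteration is the `Collator` class from PHP's intl extension. This is only a sketch and not the approach of the answer above: it assumes the intl extension is loaded (bundled from PHP 5.3, so it would not be available out of the box on the asker's 5.2.13), that the file is saved as UTF-8, and the locale name 'es_ES' is simply carried over from the question.

```php
<?php
// Sketch under assumptions: intl extension loaded, source file encoded as UTF-8.
$lang = array(
    'ko' => 'coreano',  // Korean
    'ar' => 'árabe',    // Arabic
    'es' => 'español',  // Spanish
    'fr' => 'francés',  // French
);

if (class_exists('Collator')) {
    // The collator uses ICU's rules for the given locale, independent of
    // which system locales are installed for setlocale().
    $collator = new Collator('es_ES');
    // Collator::asort() sorts by value and keeps the keys, like asort().
    $collator->asort($lang);
} else {
    echo "intl extension not available; array left unsorted.\n";
}

print_r($lang);
```

Because the collation rules come from ICU rather than from the operating system, this avoids the situation where the sort silently depends on whether a given system locale is installed.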

You defined your locale incorrectly in setlocale().

Change:

setlocale(LC_ALL,'es_ES.UTF-8');

To:

setlocale(LC_ALL,'es_ES');

Output:

Array ( [ar] => árabe [ko] => coreano [es] => español [fr] => francés ) 
Peremptory answered 18/5, 2012 at 8:55 Comment(8)
I did try that too, but it returns the same response: Array ( [ko] => coreano [es] => español [fr] => francés [ar] => árabe ) - Shoe
That locale is 100% correct if the file is encoded in UTF-8. In any case, locale suffix and file encoding should match. - Bolger
@andufo Try running it at phptester.net; it works fine for me there. If that's the case you should see Jon's comment and check how your file is encoded. - Peremptory
@GeorgeReith You're right. Just tested it and it worked fine in phptester.net -- any ideas on why it isn't working on my server? The file is UTF-8 encoded. - Shoe
@andufo Not sure, try using utf8_encode() on the strings as you put them into the array. - Peremptory
@andufo Try setlocale(LC_ALL,'es_ES.ISO-8859-1'); they use ISO-8859-1 encoding at phptester.net. - Peremptory
That didn't work either, but I just tested the code on my production server (in a test file) and it worked perfectly just by setting the correct locale. Thanks for the help! - Shoe
@andufo Awesome. I suggest you run echo mb_internal_encoding(); on your testing server to see what your file is actually encoded as. You can then set it correctly, such as mb_internal_encoding("UTF-8");. Your .htaccess, if you're on Apache, may be altering the encoding your pages are served in. - Peremptory

The documentation for setlocale mentions that

Different systems have different naming schemes for locales.

It's possible that your system does not recognize the locale as es_ES. If you are on Windows, try esp_ESP instead.

Bolger answered 18/5, 2012 at 9:15 Comment(0)
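
Since locale names vary between systems, one way to troubleshoot is to hand setlocale() several candidate spellings and inspect what it returns. The following is a minimal sketch, not code from the thread: the helper name `setFirstAvailableLocale` is hypothetical, and the candidate list is an assumption apart from `es_ES.UTF-8`, `es_ES` and `esp_ESP`, which were suggested above.

```php
<?php
// Hypothetical helper (not from the thread): try several spellings of a
// locale and report which one, if any, the system accepted.
function setFirstAvailableLocale($category, array $candidates)
{
    // setlocale() accepts an array and tries each name in order; it returns
    // the name that was applied, or false if none of them could be set.
    return setlocale($category, $candidates);
}

$applied = setFirstAvailableLocale(LC_COLLATE, array(
    'es_ES.UTF-8',  // spelling used in the question
    'es_ES.utf8',   // alternate spelling seen on some systems (assumption)
    'es_ES',        // no explicit encoding
    'esp_ESP',      // Windows-style name suggested above
));

if ($applied === false) {
    echo "No Spanish locale could be set; SORT_LOCALE_STRING will not sort as expected.\n";
} else {
    echo "Collation locale in use: $applied\n";
}
```

Checking the return value this way distinguishes "the locale is not installed" from "the locale is set but the strings are in a different encoding", which are the two causes discussed in this thread.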

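Try this:

// Any installed UTF-8 locale can be used here; es_ES.utf8 would be the natural choice for Spanish.
setlocale(LC_COLLATE, 'nl_BE.utf8');
$array = array('coreano','árabe','español','francés');
// strcoll() compares strings using the current LC_COLLATE rules.
// Note that usort() discards keys; use uasort() to keep index association.
usort($array, 'strcoll');
print_r($array);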
Jardine answered 18/5, 2012 at 8:59 Comment(0)

This is a non-problem!

Your initial solution works exactly as expected. The problem is that setlocale() is failing to set the locale, and as a consequence asort($array, SORT_LOCALE_STRING) does not sort as you expect.

You can try your own code at phptester.net, which does accept setlocale():

$lang["ko"] = "coreano"; //korean
$lang["ar"] = "árabe"; //arabic
$lang["es"] = "español"; //spanish
$lang["fr"] = "francés"; //french

asort($lang,SORT_LOCALE_STRING);
echo "<pre>";
print_r($lang);
echo "</pre>";

echo "<pre>";
/* This should return es_ES.
   If it returns false, the call failed and asort() won't produce the expected order.
*/
var_dump(setlocale(LC_ALL,'es_ES')); 
echo "</pre>";

asort($lang,SORT_LOCALE_STRING);
echo "<pre>";
print_r($lang);
echo "</pre>";
Mahican answered 20/12, 2019 at 18:24 Comment(0)
