Converting language names to ISO 639 language codes

I need to convert language names like 'Hungarian', 'English' to ISO 639 codes. ISO 639-6 would be the best but ISO 639-2 is good enough. What's the best way to achieve this?

Should I convert the English name to a Locale and get the language with getLanguage()? If that's the only way, how can I convert a string like 'English' to a Java Locale?

My goal is to store book language info using the ISO 639 codes.

Vasileior answered 14/4, 2015 at 16:13 Comment(2)
What have you tried so far? I know that java.util.Locale has some basic support for this. – Derward
I have created an enhanced ISO 639 enumeration (and other ISO enums as well). The code is available here: github.com/scout-2766/Iso4J/blob/master/README.md (free of charge). – Constantine
S
4

You can get a list of ISO 639-2 codes by passing a regular expression that matches the language name to LanguageAlpha3Code.findByName(String) (in the nv-i18n library).

The following example code is a command-line tool that converts given language names into corresponding ISO 639-2 codes.

import java.util.List;
import com.neovisionaries.i18n.LanguageAlpha3Code;

public class To639_2
{
    public static void main(String[] args)
    {
        // For each language name given on the command line.
        for (String languageName : args)
        {
            // Get a list of ISO 639-2 codes (alpha-3 codes)
            // whose language name matches the given pattern.
            List<LanguageAlpha3Code> list
                = LanguageAlpha3Code.findByName(languageName);

            // Print the language and the ISO 639-2 code.
            System.out.format("%s => %s\n", languageName,
                (list.size() != 0) ? list.get(0) : "");
        }
    }
}

A sample execution:

$ java -cp nv-i18n-1.14.jar:. To639_2 Hungarian English
Hungarian => hun
English => eng
Sybil answered 14/4, 2015 at 17:24 Comment(1)
I believe that to get ISO 639-2 you would have to replace list.get(0) with list.get(0).getAlpha3B(), or else you are just getting ISO3, just like with getISO3Language() in Locale. – Statics
L
5
    for (Locale locale : Locale.getAvailableLocales()) {
        System.out.println("" + locale
                + "; display: " + locale.getDisplayLanguage()
                + "; name: " + locale.getDisplayName()
                + "; lang: " + locale.getLanguage()
                + "; iso3: " + locale.getISO3Language());
    }

This will list some 150 locales. getISO3Language() returns the three-letter code, as opposed to the older two-letter code returned by getLanguage().

The display language is the bare language name, whereas the display name also includes the country, for example "German (Austria)".
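To make the difference between these accessors concrete, here is a minimal sketch for a single locale, de-AT. The class name DisplayNames and the helper describe are made up for illustration; the Locale methods themselves are the standard JDK ones, and Locale.ENGLISH is passed so that the result does not depend on the default locale.

```java
import java.util.Locale;

public class DisplayNames {
    // Hypothetical helper: joins the four values discussed above for one locale.
    static String describe(Locale locale) {
        return locale.getDisplayLanguage(Locale.ENGLISH)         // bare language name
                + " | " + locale.getDisplayName(Locale.ENGLISH)  // name including the country
                + " | " + locale.getLanguage()                   // two-letter code
                + " | " + locale.getISO3Language();              // three-letter code
    }

    public static void main(String[] args) {
        System.out.println(describe(Locale.forLanguageTag("de-AT")));
    }
}
```

Note that for German the three-letter code comes out as deu, not ger: where ISO 639-2 defines both a bibliographic and a terminologic code, getISO3Language() returns the terminologic one.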

So

public String toISO3(String name) {
    for (Locale locale : Locale.getAvailableLocales()) {
        if (name.equals(locale.getDisplayLanguage())) {
            return locale.getISO3Language();
        }
    }
    throw new IllegalArgumentException("No language found: " + name);
}

The display methods take an optional Locale parameter. Pass Locale.ENGLISH explicitly so that the comparison against English names does not depend on the default locale.
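A minimal sketch of that variant, assuming the input names are English. The class name LanguageNameLookup and the method toIso3 are hypothetical; the match is case-insensitive, and the method returns an Optional instead of throwing, which is a design choice of this sketch rather than part of the approach itself.

```java
import java.util.Locale;
import java.util.MissingResourceException;
import java.util.Optional;

public class LanguageNameLookup {
    // Hypothetical helper: looks up the three-letter code for an English language name.
    static Optional<String> toIso3(String englishName) {
        for (Locale locale : Locale.getAvailableLocales()) {
            // Ask for the display language in English, regardless of the default locale.
            String display = locale.getDisplayLanguage(Locale.ENGLISH);
            if (englishName.equalsIgnoreCase(display)) {
                try {
                    return Optional.of(locale.getISO3Language());
                } catch (MissingResourceException e) {
                    // No three-letter code is known for this locale; keep searching.
                }
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        for (String name : new String[] {"Hungarian", "English", "NoSuchLanguage"}) {
            System.out.println(name + " => " + toIso3(name).orElse("unknown"));
        }
    }
}
```

Like the loop above, this only knows the languages for which the JDK ships a locale, so a less common book language may come back empty.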

Lamellirostral answered 14/4, 2015 at 16:40 Comment(1)
Please note that this approach matches one language name per pair of locales and does not capture variations such as "अंग्रेजी" and "अँग्रेज़ी" (both are ways of saying English in Hindi). – Entresol
T
1
/**
 * Gets the three-letter language code for a given language name,
 * as a Locale can't be instantiated from a language name.
 *
 * You can specify which language the names are in: Locale loc = new Locale("en"); use whatever your language is.
 *
 * @param lng the given language name, e.g. "English"
 * @return the three-letter code, e.g. "eng", or "unknown" if no language matches
 *
 * Wilson M Penha Jr.
 */
private String getLanguageCode(String lng) {
    Locale loc = new Locale("en");
    String[] codes = Locale.getISOLanguages(); // static method: list of two-letter language codes

    for (String code : codes) {
        Locale locale = new Locale(code, "US");
        // get the language name in English for comparison
        String langLocal = locale.getDisplayLanguage(loc);
        // compare case-insensitively so that "English" matches as well as "english"
        if (lng.equalsIgnoreCase(langLocal)) {
            return locale.getISO3Language();
        }
    }
    return "unknown";
}
Thunderstruck answered 23/5, 2017 at 19:9 Comment(0)
