CultureInfo and ISO 639-3

I'm looking for a way to construct a CultureInfo object from an ISO 639-3 language code. I couldn't find anything about this on MSDN, and searching the list of all cultures didn't work:

CultureInfo cInfo = CultureInfo.GetCultures(CultureTypes.AllCultures)
    .FirstOrDefault(r => String.Equals(r.ThreeLetterISOLanguageName, "CCH",
        StringComparison.CurrentCultureIgnoreCase));

This always returns null (note that "CCH" is a language code from the ISO 639-3 list).

Any ideas are appreciated, thanks!

Spancake answered 10/1, 2014 at 10:37 Comment(1)
That's not going to fly, culture is not language. CultureInfo makes no attempt at mapping every possible language in the world. A language is a dialect with an army and a navy.Thornie
A
10

The MSDN documentation states that CultureInfo objects expose only an ISO 639-2 three-letter code and an ISO 639-1 two-letter code. That means you will need a mapping of some kind to link your ISO 639-3 code to a specific CultureInfo instance.

This Wikipedia page has the table with the mappings. Maybe you could cut-and-paste into an XML file and use it as an embedded resource in a class library in order to provide the mapping. Or even just define a static Dictionary<string,string> somewhere.
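
A minimal sketch of the static-dictionary idea, assuming a small hand-picked subset of codes. The class name Iso639Cultures and its three entries are illustrative only; a real table would be filled from the ISO 639-3 code tables. Codes without an ISO 639-1 counterpart (such as "cch") simply have no entry, so the lookup reports failure instead of throwing:

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class Iso639Cultures
{
    // Hypothetical hand-maintained subset: ISO 639-3 code -> .NET culture name.
    private static readonly Dictionary<string, string> Map =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { "eng", "en" },
            { "deu", "de" },
            { "fra", "fr" },
        };

    // Returns false (and a null culture) when the code has no mapped culture.
    public static bool TryGetCulture(string iso639_3, out CultureInfo culture)
    {
        culture = null;
        if (iso639_3 == null || !Map.TryGetValue(iso639_3, out string name))
            return false;

        culture = CultureInfo.GetCultureInfo(name);
        return true;
    }
}
```

A caller would then write something like `Iso639Cultures.TryGetCulture("deu", out var culture)` and fall back to its own handling when the method returns false.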

Alternatively, I'm sure there will be a 3rd party library that can do this for you (though I don't know of any off the top of my head).

edit:

I hadn't realised ISO 639-3 codes only sometimes have a mapping to ISO 639-2 codes. The problem here is that the CultureInfo class isn't designed to handle the ISO 639-3 specification, so you may have to find a completely different 3rd party implementation of CultureInfo which will support this - or make it yourself.

Antonetteantoni answered 10/1, 2014 at 10:46 Comment(0)
S
5

I had a similar need to convert between ISO 639-2B/T and ISO 639-3 formats. I created a T4 solution that generates a list of all the 7K+ entries when the template runs. I could have used a dictionary instead of a list, but since I search on multiple fields there was not much value in it.

Download and extract the tab-delimited text file from http://www-01.sil.org/iso639-3/download.asp, copy it to your project path, and rename it as appropriate (the template below expects iso-639-3.tab).

Create a design time template file: https://msdn.microsoft.com/en-us/library/dd820620.aspx

<#@ template debug="true" hostspecific="true" language="C#" #>
<#@ output extension=".cs" #>
<#@ assembly name="System.Core" #>
<#@ assembly name="Microsoft.VisualBasic.dll" #> 
<#@ import namespace="System.Linq" #>
<#@ import namespace="System.Text" #>
<#@ import namespace="System.Collections.Generic" #>
<#@ import namespace="Microsoft.VisualBasic.FileIO" #>

// Generated code
using System.Collections.Generic;

namespace Foo
{
    // ISO 639-3
    // http://www-01.sil.org/iso639-3/download.asp
    public class ISO_639_3
    {
        // The three-letter 639-3 identifier
        public string Id { get; set; }
        // Equivalent 639-2 identifier of the bibliographic applications code set, if there is one
        public string Part2B { get; set; }
        // Equivalent 639-2 identifier of the terminology applications code set, if there is one
        public string Part2T { get; set; }
        // Equivalent 639-1 identifier, if there is one
        public string Part1 { get; set; }
        // I(ndividual), M(acrolanguage), S(pecial)
        public string Scope { get; set; }
        // A(ncient), C(onstructed), E(xtinct), H(istorical), L(iving), S(pecial)
        public string Language_Type { get; set; }
        // Reference language name
        public string Ref_Name { get; set; }
        // Comment relating to one or more of the columns
        public string Comment { get; set; }

        // Create a list of all known codes
        public static List<ISO_639_3> Create()
        {
            List<ISO_639_3> list = new List<ISO_639_3> {
<# 
    // Setup text parser
    string filename = this.Host.ResolvePath("iso-639-3.tab"); 
    TextFieldParser tfp = new TextFieldParser(filename)
    {
        TextFieldType = FieldType.Delimited,
        Delimiters = new[] { ",", "\t" },
        HasFieldsEnclosedInQuotes = true,
        TrimWhiteSpace = true
    };

    // Read first row as header
    string[] header = tfp.ReadFields();

    // Read rows from file
    // For debugging limit the row count
    //int maxrows = 10;
    int maxrows = int.MaxValue;
    int rowcount = 0;
    string term = "";
    while (!tfp.EndOfData && rowcount < maxrows)
    {
        // Read row of data from the file
        string[] row = tfp.ReadFields();
        rowcount ++;

        // Add "," on all but last line
        term = tfp.EndOfData || rowcount >= maxrows ? "" : ",";

        // Add new item from row data
#>
                new ISO_639_3 { Id = "<#=row[0]#>", Part2B = "<#=row[1]#>", Part2T = "<#=row[2]#>", Part1 = "<#=row[3]#>", Scope = "<#=row[4]#>", Language_Type = "<#=row[5]#>", Ref_Name = "<#=row[6]#>", Comment = "<#=row[7]#>" }<#=term#>
<# 
    } 
#>  
            };
            return list;
        }

    }

}

The generated code creates an initializer for a list with all the languages. The file is big: it slows down editing and takes a long time to compile, so keep it unloaded unless you need it. Snippet:

public static List<ISO_639_3> Create()
{
    List<ISO_639_3> list = new List<ISO_639_3> {
        new ISO_639_3 { Id = "aaa", Part2B = "", Part2T = "", Part1 = "", Scope = "I", Language_Type = "L", Ref_Name = "Ghotuo", Comment = "" },
        new ISO_639_3 { Id = "aab", Part2B = "", Part2T = "", Part1 = "", Scope = "I", Language_Type = "L", Ref_Name = "Alumu-Tesu", Comment = "" },
        new ISO_639_3 { Id = "aac", Part2B = "", Part2T = "", Part1 = "", Scope = "I", Language_Type = "L", Ref_Name = "Ari", Comment = "" },

Use the generated list to map as needed, e.g.

    public static ISO_639_3 GetISO_639_3(string language)
    {
        // Create list if it does not exist
        if (Program.Default.ISO6393List == null)
        {
            Program.Default.ISO6393List = ISO_639_3.Create();
        }

        // Match the input string type
        ISO_639_3 lang = null;
        if (language.Length > 3 && language.ElementAt(2) == '-')
        {
            // Treat the language as a culture form, e.g. en-us
            CultureInfo cix = new CultureInfo(language);

            // Recursively call using the ISO 639-2 code
            return GetISO_639_3(cix.ThreeLetterISOLanguageName);
        }
        else if (language.Length > 3)
        {
            // Try long form
            lang = Program.Default.ISO6393List.Where(item => item.Ref_Name.Equals(language, StringComparison.OrdinalIgnoreCase)).FirstOrDefault();
            if (lang != null)
                return lang;
        }
        else if (language.Length == 3)
        {

            // Try 639-3
            lang = Program.Default.ISO6393List.Where(item => item.Id.Equals(language, StringComparison.OrdinalIgnoreCase)).FirstOrDefault();
            if (lang != null)
                return lang;

            // Try the 639-2/B
            lang = Program.Default.ISO6393List.Where(item => item.Part2B.Equals(language, StringComparison.OrdinalIgnoreCase)).FirstOrDefault();
            if (lang != null)
                return lang;

            // Try the 639-2/T
            lang = Program.Default.ISO6393List.Where(item => item.Part2T.Equals(language, StringComparison.OrdinalIgnoreCase)).FirstOrDefault();
            if (lang != null)
                return lang;
        }
        else if (language.Length == 2)
        {
            // Try 639-1
            lang = Program.Default.ISO6393List.Where(item => item.Part1.Equals(language, StringComparison.OrdinalIgnoreCase)).FirstOrDefault();
            if (lang != null)
                return lang;
        }

        // Not found
        return lang;
    }
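
To close the loop with the original question, a matched entry can be turned into a CultureInfo through its ISO 639-1 code, when it has one. This is only a sketch: LanguageEntry is a stand-in for the generated ISO_639_3 class, and ToCulture is a hypothetical helper name, not part of the answer above. Entries without a Part1 code have no obvious culture and yield null here:

```csharp
using System;
using System.Globalization;

// Stand-in for the generated ISO_639_3 class, reduced to the fields used here.
public class LanguageEntry
{
    public string Id { get; set; }
    public string Part1 { get; set; }
}

public static class LanguageEntryExtensions
{
    // Hypothetical helper: returns null when no culture can be derived.
    public static CultureInfo ToCulture(this LanguageEntry entry)
    {
        if (entry == null || string.IsNullOrEmpty(entry.Part1))
            return null;

        try
        {
            return CultureInfo.GetCultureInfo(entry.Part1);
        }
        catch (CultureNotFoundException)
        {
            // The platform does not know this culture name.
            return null;
        }
    }
}
```

Most of the 7K+ entries have no ISO 639-1 code, so callers should expect null for the majority of languages.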
Sordino answered 12/11, 2017 at 0:6 Comment(1)
Here's the RegEx for parsing the file, if you prefer it. You can filter Scope and Type with only letters you want: ^(?<Id>\p{Ll}{3})\s(?<Part2B>\p{Ll}{3})?\s(?<Part2T>\p{Ll}{3})?\s(?<Part1>\p{Ll}{2})?\s(?<Scope>[CILMS])?\s(?<Type>[ACEKHLS])?\s(?<Name>[\p{L}\p{N}\p{Z}\p{P}a̱]*)?\sMacdougall
T
2

I found myself needing an enum for ISO 639-3. If you don't actually need to map it to CultureInfo, then maybe this will help:

http://snipplr.com/view/76196/enum-for-iso-6393-language-codes/

Thrashing answered 14/8, 2014 at 20:16 Comment(4)
That list has a language called "=/Kx'au//'ein"... Surely that's a typo?Anemo
It wasn't my list, I just found it somewhere else and formatted it for C# then shared it here. That said: ethnologue.com/17/language/aueThrashing
Well now... look at that! thanks for sharing! It helped me out.Anemo
Beware: The enum is generated off a cropped list -- it only includes about 500 language codes. The actual standard contains over 7000 entries.Tangelatangelo
A
2

Please look into T4 text templates (*.tt).

It will allow you to generate the file whenever you resave it in your project:

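<#@ template language="C#" #>
<#@ assembly name="System.Core" #>
<#@ import namespace="System.Linq" #>
<#@ import namespace="System.Globalization" #>
<#@ output extension=".cs" #>
namespace YourProject.Enum
{
    enum eLanguage
    {
        Unknown,
<#
        // Many cultures share one language code, so emit each code only once.
        // The "@" prefix keeps codes that are C# keywords (e.g. "is", "as") valid.
        var codes = CultureInfo.GetCultures(CultureTypes.AllCultures)
            .Select(c => c.TwoLetterISOLanguageName)
            .Distinct();
        foreach (var code in codes) { #>
        @<#= code #>,
<#
        }
#>
        Other
    }
}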
Asp answered 13/11, 2014 at 11:31 Comment(0)
R
2

Here is an enum on steroids:

namespace System.Globalization
{
    using System.Collections.Generic;
    public enum SpokenLang
    {

        [Spoken.Lang("Afaraf", "aa", "aar")]
        Afar,
        [Spoken.Lang("Аҧсуа", "ab", "abk")]
        Abkhazian,
        [Spoken.Lang("Afrikaans", "af", "afr")]
        Afrikaans,
        [Spoken.Lang("Akan", "ak", "aka")]
        Akan,
        [Spoken.Lang("አማርኛ", "am", "amh")]
        Amharic,
        [Spoken.Lang("‫العربية", "ar", "ara")]
        Arabic,
        [Spoken.Lang("Aragonés", "an", "arg")]
        Aragonese,
        [Spoken.Lang("অসমীয়া", "as", "asm")]
        Assamese,
        [Spoken.Lang("авар мацӀ", "av", "ava")]
        Avaric,
        [Spoken.Lang("Avestan", "ae", "ave")]
        Avestan,
        [Spoken.Lang("Aymar aru", "ay", "aym")]
        Aymara,
        [Spoken.Lang("Azərbaycan dili", "az", "aze")]
        Azerbaijani,
        [Spoken.Lang("башҡорт теле", "ba", "bak")]
        Bashkir,
        [Spoken.Lang("Bamanankan", "bm", "bam")]
        Bambara,
        [Spoken.Lang("Беларуская", "be", "bel")]
        Belarusian,
        [Spoken.Lang("বাংলা", "bn", "ben")]
        Bengali,
        [Spoken.Lang("Bislama", "bi", "bis")]
        Bislama,
        [Spoken.Lang("བོད་ཡིག", "bo", "bod")]
        Tibetan,
        [Spoken.Lang("Bosanski jezik", "bs", "bos")]
        Bosnian,
        [Spoken.Lang("Brezhoneg", "br", "bre")]
        Breton,
        [Spoken.Lang("български език", "bg", "bul")]
        Bulgarian,
        [Spoken.Lang("Català", "ca", "cat")]
        Catalan,
        [Spoken.Lang("Česky", "cs", "ces")]
        Czech,
        [Spoken.Lang("Chamoru", "ch", "cha")]
        Chamorro,
        [Spoken.Lang("нохчийн мотт", "ce", "che")]
        Chechen,
        [Spoken.Lang("Словѣньскъ", "cu", "chu")]
        ChurchSlavic,
        [Spoken.Lang("чӑваш чӗлхи", "cv", "chv")]
        Chuvash,
        [Spoken.Lang("Kernewek", "kw", "cor")]
        Cornish,
        [Spoken.Lang("Corsu", "co", "cos")]
        Corsican,
        [Spoken.Lang("ᓀᐦᐃᔭᐍᐏᐣ", "cr", "cre")]
        Cree,
        [Spoken.Lang("Cymraeg", "cy", "cym")]
        Welsh,
        [Spoken.Lang("Dansk", "da", "dan")]
        Danish,
        [Spoken.Lang("Deutsch", "de", "deu")]
        German,
        [Spoken.Lang("‫ދިވެހި", "dv", "div")]
        Dhivehi,
        [Spoken.Lang("རྫོང་ཁ", "dz", "dzo")]
        Dzongkha,
        [Spoken.Lang("Ελληνικά", "el", "ell")]
        ModernGreek,
        [Spoken.Lang("English", "en", "eng")]
        English,
        [Spoken.Lang("Esperanto", "eo", "epo")]
        Esperanto,
        [Spoken.Lang("Eesti keel", "et", "est")]
        Estonian,
        [Spoken.Lang("Euskara", "eu", "eus")]
        Basque,
        [Spoken.Lang("Ɛʋɛgbɛ", "ee", "ewe")]
        Ewe,
        [Spoken.Lang("Føroyskt", "fo", "fao")]
        Faroese,
        [Spoken.Lang("‫فارسی", "fa", "fas")]
        Persian,
        [Spoken.Lang("Vosa Vakaviti", "fj", "fij")]
        Fijian,
        [Spoken.Lang("Suomen kieli", "fi", "fin")]
        Finnish,
        [Spoken.Lang("Français", "fr", "fra")]
        French,
        [Spoken.Lang("Frysk", "fy", "fry")]
        WesternFrisian,
        [Spoken.Lang("Fulfulde", "ff", "ful")]
        Fulah,
        [Spoken.Lang("Gàidhlig", "gd", "gla")]
        ScottishGaelic,
        [Spoken.Lang("Gaeilge", "ga", "gle")]
        Irish,
        [Spoken.Lang("Galego", "gl", "glg")]
        Galician,
        [Spoken.Lang("Ghaelg", "gv", "glv")]
        Manx,
        [Spoken.Lang("Avañe'ẽ", "gn", "grn")]
        Guarani,
        [Spoken.Lang("ગુજરાતી", "gu", "guj")]
        Gujarati,
        [Spoken.Lang("Kreyòl ayisyen", "ht", "hat")]
        Haitian,
        [Spoken.Lang("‫هَوُسَ", "ha", "hau")]
        Hausa,
        [Spoken.Lang("Serbo-Croatian", "sh", "hbs")]
        SerboCroatian,
        [Spoken.Lang("‫עברית", "he", "heb")]
        Hebrew,
        [Spoken.Lang("Otjiherero", "hz", "her")]
        Herero,
        [Spoken.Lang("हिन्दी", "hi", "hin")]
        Hindi,
        [Spoken.Lang("Hiri Motu", "ho", "hmo")]
        HiriMotu,
        [Spoken.Lang("Hrvatski", "hr", "hrv")]
        Croatian,
        [Spoken.Lang("magyar", "hu", "hun")]
        Hungarian,
        [Spoken.Lang("Հայերեն", "hy", "hye")]
        Armenian,
        [Spoken.Lang("Igbo", "ig", "ibo")]
        Igbo,
        [Spoken.Lang("Ido", "io", "ido")]
        Ido,
        [Spoken.Lang("ꆇꉙ", "ii", "iii")]
        SichuanYi,
        [Spoken.Lang("ᐃᓄᒃᑎᑐᑦ", "iu", "iku")]
        Inuktitut,
        [Spoken.Lang("Interlingue", "ie", "ile")]
        Interlingue,
        [Spoken.Lang("Interlingua", "ia", "ina")]
        Interlingua,
        [Spoken.Lang("Bahasa Indonesia", "id", "ind")]
        Indonesian,
        [Spoken.Lang("Iñupiaq", "ik", "ipk")]
        Inupiaq,
        [Spoken.Lang("Íslenska", "is", "isl")]
        Icelandic,
        [Spoken.Lang("Italiano", "it", "ita")]
        Italian,
        [Spoken.Lang("Basa Jawa", "jv", "jav")]
        Javanese,
        [Spoken.Lang("日本語", "ja", "jpn")]
        Japanese,
        [Spoken.Lang("Kalaallisut", "kl", "kal")]
        Kalaallisut,
        [Spoken.Lang("ಕನ್ನಡ", "kn", "kan")]
        Kannada,
        [Spoken.Lang("कश्मीरी", "ks", "kas")]
        Kashmiri,
        [Spoken.Lang("ქართული", "ka", "kat")]
        Georgian,
        [Spoken.Lang("Kanuri", "kr", "kau")]
        Kanuri,
        [Spoken.Lang("Қазақ тілі", "kk", "kaz")]
        Kazakh,
        [Spoken.Lang("ភាសាខ្មែរ", "km", "khm")]
        Khmer,
        [Spoken.Lang("Gĩkũyũ", "ki", "kik")]
        Kikuyu,
        [Spoken.Lang("Kinyarwanda", "rw", "kin")]
        Kinyarwanda,
        [Spoken.Lang("кыргыз тили", "ky", "kir")]
        Kirghiz,
        [Spoken.Lang("коми кыв", "kv", "kom")]
        Komi,
        [Spoken.Lang("KiKongo", "kg", "kon")]
        Kongo,
        [Spoken.Lang("한국어", "ko", "kor")]
        Korean,
        [Spoken.Lang("Kuanyama", "kj", "kua")]
        Kuanyama,
        [Spoken.Lang("Kurdî", "ku", "kur")]
        Kurdish,
        [Spoken.Lang("ພາສາລາວ", "lo", "lao")]
        Lao,
        [Spoken.Lang("Latine", "la", "lat")]
        Latin,
        [Spoken.Lang("Latviešu valoda", "lv", "lav")]
        Latvian,
        [Spoken.Lang("Limburgs", "li", "lim")]
        Limburgan,
        [Spoken.Lang("Lingála", "ln", "lin")]
        Lingala,
        [Spoken.Lang("Lietuvių kalba", "lt", "lit")]
        Lithuanian,
        [Spoken.Lang("Lëtzebuergesch", "lb", "ltz")]
        Luxembourgish,
        [Spoken.Lang("kiluba", "lu", "lub")]
        LubaKatanga,
        [Spoken.Lang("Luganda", "lg", "lug")]
        Ganda,
        [Spoken.Lang("Kajin M̧ajeļ", "mh", "mah")]
        Marshallese,
        [Spoken.Lang("മലയാളം", "ml", "mal")]
        Malayalam,
        [Spoken.Lang("मराठी", "mr", "mar")]
        Marathi,
        [Spoken.Lang("македонски јазик", "mk", "mkd")]
        Macedonian,
        [Spoken.Lang("Fiteny malagasy", "mg", "mlg")]
        Malagasy,
        [Spoken.Lang("Malti", "mt", "mlt")]
        Maltese,
        [Spoken.Lang("Монгол", "mn", "mon")]
        Mongolian,
        [Spoken.Lang("Te reo Māori", "mi", "mri")]
        Maori,
        [Spoken.Lang("Bahasa Melayu", "ms", "msa")]
        Malay,
        [Spoken.Lang("ဗမာစာ", "my", "mya")]
        Burmese,
        [Spoken.Lang("Ekakairũ Naoero", "na", "nau")]
        Nauru,
        [Spoken.Lang("Diné bizaad", "nv", "nav")]
        Navajo,
        [Spoken.Lang("Ndébélé", "nr", "nbl")]
        SouthNdebele,
        [Spoken.Lang("isiNdebele", "nd", "nde")]
        NorthNdebele,
        [Spoken.Lang("Owambo", "ng", "ndo")]
        Ndonga,
        [Spoken.Lang("नेपाली", "ne", "nep")]
        Nepali,
        [Spoken.Lang("Nederlands", "nl", "nld")]
        Dutch,
        [Spoken.Lang("Norsk nynorsk", "nn", "nno")]
        NorwegianNynorsk,
        [Spoken.Lang("Norsk bokmål", "nb", "nob")]
        NorwegianBokmål,
        [Spoken.Lang("Norsk", "no", "nor")]
        Norwegian,
        [Spoken.Lang("ChiCheŵa", "ny", "nya")]
        Nyanja,
        [Spoken.Lang("Occitan", "oc", "oci")]
        Occitan,
        [Spoken.Lang("ᐊᓂᔑᓈᐯᒧᐎᓐ", "oj", "oji")]
        Ojibwa,
        [Spoken.Lang("ଓଡ଼ିଆ", "or", "ori")]
        Oriya,
        [Spoken.Lang("Afaan Oromoo", "om", "orm")]
        Oromo,
        [Spoken.Lang("Ирон ӕвзаг", "os", "oss")]
        Ossetian,
        [Spoken.Lang("ਪੰਜਾਬੀ", "pa", "pan")]
        Panjabi,
        [Spoken.Lang("पािऴ", "pi", "pli")]
        Pali,
        [Spoken.Lang("Polski", "pl", "pol")]
        Polish,
        [Spoken.Lang("Português", "pt", "por")]
        Portuguese,
        [Spoken.Lang("‫پښتو", "ps", "pus")]
        Pushto,
        [Spoken.Lang("Runa Simi", "qu", "que")]
        Quechua,
        [Spoken.Lang("Rumantsch grischun", "rm", "roh")]
        Romansh,
        [Spoken.Lang("Română", "ro", "ron")]
        Romanian,
        [Spoken.Lang("kiRundi", "rn", "run")]
        Rundi,
        [Spoken.Lang("русский язык", "ru", "rus")]
        Russian,
        [Spoken.Lang("Yângâ tî sängö", "sg", "sag")]
        Sango,
        [Spoken.Lang("संस्कृतम्", "sa", "san")]
        Sanskrit,
        [Spoken.Lang("සිංහල", "si", "sin")]
        Sinhala,
        [Spoken.Lang("Slovenčina", "sk", "slk")]
        Slovak,
        [Spoken.Lang("Slovenščina", "sl", "slv")]
        Slovenian,
        [Spoken.Lang("Davvisámegiella", "se", "sme")]
        NorthernSami,
        [Spoken.Lang("Gagana fa'a Samoa", "sm", "smo")]
        Samoan,
        [Spoken.Lang("chiShona", "sn", "sna")]
        Shona,
        [Spoken.Lang("सिन्धी", "sd", "snd")]
        Sindhi,
        [Spoken.Lang("Soomaaliga", "so", "som")]
        Somali,
        [Spoken.Lang("seSotho", "st", "sot")]
        SouthernSotho,
        [Spoken.Lang("Español", "es", "spa")]
        Spanish,
        [Spoken.Lang("Shqip", "sq", "sqi")]
        Albanian,
        [Spoken.Lang("sardu", "sc", "srd")]
        Sardinian,
        [Spoken.Lang("српски језик", "sr", "srp")]
        Serbian,
        [Spoken.Lang("SiSwati", "ss", "ssw")]
        Swati,
        [Spoken.Lang("Basa Sunda", "su", "sun")]
        Sundanese,
        [Spoken.Lang("Kiswahili", "sw", "swa")]
        Swahili,
        [Spoken.Lang("Svenska", "sv", "swe")]
        Swedish,
        [Spoken.Lang("Reo Mā`ohi", "ty", "tah")]
        Tahitian,
        [Spoken.Lang("தமிழ்", "ta", "tam")]
        Tamil,
        [Spoken.Lang("татарча", "tt", "tat")]
        Tatar,
        [Spoken.Lang("తెలుగు", "te", "tel")]
        Telugu,
        [Spoken.Lang("тоҷикӣ", "tg", "tgk")]
        Tajik,
        [Spoken.Lang("Tagalog", "tl", "tgl")]
        Tagalog,
        [Spoken.Lang("ไทย", "th", "tha")]
        Thai,
        [Spoken.Lang("ትግርኛ", "ti", "tir")]
        Tigrinya,
        [Spoken.Lang("faka Tonga", "to", "ton")]
        Tonga,
        [Spoken.Lang("seTswana", "tn", "tsn")]
        Tswana,
        [Spoken.Lang("xiTsonga", "ts", "tso")]
        Tsonga,
        [Spoken.Lang("Türkmen", "tk", "tuk")]
        Turkmen,
        [Spoken.Lang("Türkçe", "tr", "tur")]
        Turkish,
        [Spoken.Lang("Twi", "tw", "twi")]
        Twi,
        [Spoken.Lang("Uyƣurqə", "ug", "uig")]
        Uighur,
        [Spoken.Lang("українська мова", "uk", "ukr")]
        Ukrainian,
        [Spoken.Lang("‫اردو", "ur", "urd")]
        Urdu,
        [Spoken.Lang("O'zbek", "uz", "uzb")]
        Uzbek,
        [Spoken.Lang("tshiVenḓa", "ve", "ven")]
        Venda,
        [Spoken.Lang("Tiếng Việt", "vi", "vie")]
        Vietnamese,
        [Spoken.Lang("Volapük", "vo", "vol")]
        Volapük,
        [Spoken.Lang("Walon", "wa", "wln")]
        Walloon,
        [Spoken.Lang("Wollof", "wo", "wol")]
        Wolof,
        [Spoken.Lang("isiXhosa", "xh", "xho")]
        Xhosa,
        [Spoken.Lang("‫ייִדיש", "yi", "yid")]
        Yiddish,
        [Spoken.Lang("Yorùbá", "yo", "yor")]
        Yoruba,
        [Spoken.Lang("Saɯ cueŋƅ", "za", "zha")]
        Zhuang,
        [Spoken.Lang("中文", "zh", "zho")]
        Chinese,
        [Spoken.Lang("isiZulu", "zu", "zul")]
        Zulu,

    }


    public static class Spoken
    {
        public class LangAttribute : Attribute
        {
            public LangAttribute(string name, string iso639_1, string iso639_3)
            {
                NativeName = name;
                Iso639_1 = iso639_1;
                Iso639_3 = iso639_3;
            }
            public string Iso639_1 { get; }
            public string Iso639_3 { get; }
            public string NativeName { get; }

        }


        static readonly Dictionary<string, SpokenLang> isoMap = new Dictionary<string, SpokenLang>();
        static readonly Dictionary<SpokenLang, CultureInfo> langToCultureInfo = new Dictionary<SpokenLang, CultureInfo>();

        static Spoken()
        {

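            foreach (SpokenLang lang in Enum.GetValues(typeof(SpokenLang)))
            {
                var langAttr = lang.GetLangAttribute();
                // Use the indexer rather than Add: for several languages the enum
                // name equals the native name (e.g. "Afrikaans"), and Add would
                // throw on the duplicate key.
                isoMap[langAttr.Iso639_1] = lang;
                isoMap[langAttr.Iso639_3] = lang;
                isoMap[lang.ToString()] = lang;
                isoMap[langAttr.NativeName] = lang;
            }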
        }

        public static bool TryParseLanguageString(string str, out SpokenLang result)
       => isoMap.TryGetValue(str, out result);

        public static SpokenLang ParseLanguageString(string str)
        {
            if (!TryParseLanguageString(str, out var result))
                throw new ArgumentException($"unknown language string: {str}");

            return result;
        }

        public static LangAttribute GetLangAttribute(this SpokenLang enumVal)
        {
            var type = enumVal.GetType();
            var memInfo = type.GetMember(enumVal.ToString());
            var attributes = memInfo[0].GetCustomAttributes(typeof(LangAttribute), false);
            return (attributes.Length > 0) ? (LangAttribute)attributes[0] : null;
        }

        public static CultureInfo ToCultureInfo(this SpokenLang language) =>
         CultureInfo.GetCultureInfo(language.GetLangAttribute().Iso639_1);
    }
}

Built from @PieterV's T4 template.

Realpolitik answered 15/11, 2018 at 23:59 Comment(1)
Can someone mark this the accepted answer? Personally I would have made different dictionaries for ISO 639-1 and 639-3, but that is style and usage. Also I would rename GetLangAttribute to FindLangAttribute since it can return null, and use StringComparer.InvariantCultureIgnoreCase as the comparer for the dictionary. But just suggestions.Bibliophage
