Localization of lists

What is the right way to localize a list of strings? I know that the separator can be localized to a comma or a semi-colon but does the conjunction get localized? If so, what would my format string for an arbitrary length list look like?

Example

"Bat, Cat and Dog". I could use the locale's separator and construct LIST as follows:

LIST := UNIT
LISTMID := UNIT SEPARATOR UNIT
LISTMID := LISTMID SEPARATOR UNIT
LIST := UNIT CONJUNCTION UNIT
LIST := LISTMID CONJUNCTION UNIT
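
As a rough illustration, the grammar above could be sketched in Python like this. The `join_list` helper and its `separator` and `conjunction` parameters are hypothetical names chosen for the example, and the defaults are just the English values from the "Bat, Cat and Dog" case. It only implements the rule as written, so it assumes every language fits a "separator, then one conjunction before the last item" shape, which may not hold.

```python
def join_list(units, separator=", ", conjunction=" and "):
    """Join units following the LIST / LISTMID grammar (hypothetical helper)."""
    if not units:
        return ""  # the grammar does not cover an empty list; assume ""
    if len(units) == 1:
        return units[0]  # LIST := UNIT
    # LISTMID: all units except the last, joined by SEPARATOR.
    # With exactly two units this is just the first unit.
    listmid = separator.join(units[:-1])
    # LIST := UNIT CONJUNCTION UNIT  or  LIST := LISTMID CONJUNCTION UNIT
    return listmid + conjunction + units[-1]


print(join_list(["Bat"]))                # Bat
print(join_list(["Bat", "Cat"]))         # Bat and Cat
print(join_list(["Bat", "Cat", "Dog"]))  # Bat, Cat and Dog
```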

Would I have to craft this rule per language? Any libraries available to help with this?

Elihu answered 28/3, 2017 at 22:31

I came here looking for an answer to the same question and, after some more googling, found this: http://icu-project.org/apiref/icu4j/com/ibm/icu/text/ListFormatter.html

The class takes four pattern strings: two, start, middle, and end:

  • two - string for a list of exactly two items, containing {0} for the first and {1} for the second.
  • start - string for the start of a list, containing {0} for the first item and {1} for the rest.
  • middle - string for the middle of a list, containing {0} for the first part of the list and {1} for the rest of the list.
  • end - string for the end of a list, containing {0} for the first part of the list and {1} for the last item.

So, for English, that would be:

 - TWO := "{0} and {1}"
 - START := "{0}, {1}"
 - MIDDLE := "{0}, {1}" 
 - END := "{0} and {1}"
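
To make the nesting concrete, here is a small Python sketch that prints each intermediate string while a four-item list is built from the right with the English patterns above. The `expand` helper is a hypothetical name for this example, and it leans on the fact that Python's `str.format` happens to use the same {0}/{1} placeholder syntax. It only shows one way the patterns can compose and is not taken from ICU itself.

```python
ENGLISH = {
    "two": "{0} and {1}",
    "start": "{0}, {1}",
    "middle": "{0}, {1}",
    "end": "{0} and {1}",
}


def expand(items, patterns):
    """Return the intermediate strings for a list of three or more items."""
    # "end" joins the last two items.
    steps = [patterns["end"].format(items[-2], items[-1])]
    # "middle" prepends each inner item, moving right to left.
    for item in reversed(items[1:-2]):
        steps.append(patterns["middle"].format(item, steps[-1]))
    # "start" prepends the first item.
    steps.append(patterns["start"].format(items[0], steps[-1]))
    return steps


for step in expand(["A", "B", "C", "D"], ENGLISH):
    print(step)
# C and D
# B, C and D
# A, B, C and D
```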

I wrote a quick Lua demonstration for how I imagine this works:

-- Fill {0} and {1} in a single pass, so text that has already been
-- inserted is never rescanned for placeholders. A table replacement
-- also avoids gsub treating '%' in the inserted text specially.
local function apply(template, first, rest)
    local result = template:gsub('{([01])}', { ['0'] = first, ['1'] = rest })
    return result
end

local function list_format(words, templates)
    local length = #words
    if length == 0 then return '' end  -- an empty list formats as an empty string
    if length == 1 then return words[1] end
    if length == 2 then
        return apply(templates['two'], words[1], words[2])
    end

    -- Build from the right: "end" joins the last two items,
    local result = apply(templates['end'], words[length - 1], words[length])
    -- "middle" prepends each remaining inner item,
    for i = length - 2, 2, -1 do
        result = apply(templates['middle'], words[i], result)
    end
    -- and "start" prepends the first item.
    return apply(templates['start'], words[1], result)
end

local english = {
    ["two"] = "{0} and {1}",
    ["start"] = "{0}, {1}",
    ["middle"] = "{0}, {1}",
    ["end"] = "{0} and {1}"
}

print(list_format({"banana"}, english))
-- banana
print(list_format({"banana", "apple"}, english))
-- banana and apple
print(list_format({"banana", "apple", "mango"}, english))
-- banana, apple and mango
print(list_format({"banana", "apple", "mango", "pineapple"}, english))
-- banana, apple, mango and pineapple

Adapting this to another language should only require swapping in that language's four patterns; the formatting function itself stays the same.
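
As a hedged illustration of what using other languages' patterns might look like, here is a self-contained Python sketch of the same right-to-left approach. The `list_format` function is a hypothetical port, and the German and Japanese pattern strings are my own assumptions for the example, not something from the ICU page linked above, so check them against real locale data before relying on them.

```python
def list_format(items, patterns):
    """Format items using two/start/middle/end patterns (illustrative port)."""
    if not items:
        return ""
    if len(items) == 1:
        return items[0]
    if len(items) == 2:
        return patterns["two"].format(items[0], items[1])
    # Fold from the right: end, then middle for inner items, then start.
    result = patterns["end"].format(items[-2], items[-1])
    for item in reversed(items[1:-2]):
        result = patterns["middle"].format(item, result)
    return patterns["start"].format(items[0], result)


# Assumed pattern sets, for illustration only.
GERMAN = {
    "two": "{0} und {1}",
    "start": "{0}, {1}",
    "middle": "{0}, {1}",
    "end": "{0} und {1}",
}
JAPANESE = {
    "two": "{0}、{1}",
    "start": "{0}、{1}",
    "middle": "{0}、{1}",
    "end": "{0}、{1}",
}

print(list_format(["Banane", "Apfel", "Mango"], GERMAN))
# Banane, Apfel und Mango
print(list_format(["りんご", "バナナ", "マンゴー"], JAPANESE))
# りんご、バナナ、マンゴー
```

Only the pattern table changes between the two calls. That is the appeal of this design: word order and punctuation live in data rather than in code.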

Licorice answered 20/9, 2019 at 17:46 Comment(2)
While this works for English, would it work for other languages, especially right-to-left languages? - Elihu
I have no experience with RTL languages, but the "weird" examples (Korean, Italian, Hungarian, Chinese) that I found when researching this would all be supported: https://mcmap.net/q/1470976/-making-a-variable-length-list-translatable - Licorice
