l10n/i18n: how to handle phrases with dynamic list of items?

What's the sanest way to handle translation and localization of dynamic lists?

Let's say I've queried the database and got a list ["Foos", "Bars", "Bazes"]. Let's also assume the list always contains at least two items; I'll be sure to use a different translation for the single-item case.

What should I do if I need a phrase like "We have a wide choice of Foos, Bars and Bazes in our code"? (assuming that list items are dynamic so I can't just pre-translate all the possible permutations, and need to do things at runtime.)

I see at least the following issues:

  • I need to inflect all the items to the correct form (are there languages where different forms are required depending on the position in the list?)

  • Different locales may have drastically different rules for joining items.

    • E.g. CJK locales need "、" instead of ",".
    • And AFAIK in Chinese there will be "及" or "和" - depending on the full phrase - before the last item, so I guess there's some ambiguity with translating "and".
    • And, as I've read, some languages may avoid English-style punctuation and use other conventions instead, e.g. an Arabic translator may prefer to use "و" before every item (although Arabic also has a comma, "،"). Not sure whether that's true - I don't know Arabic, I just saw it mentioned.

My problem is that I don't even know what tooling could help me here. I don't have any particular programming language requirements, although Python or JavaScript would be best. But I guess I can run just about anything, as I can probably build an l10n microservice and query it from my project.

I used GNU gettext before I ran into this, but I haven't found anything in its APIs or data formats that would help. The best I can imagine is _("We have a wide choice of %s in our code", list_text), with list_text generated by some DIY hacks. I'm not sure the XLIFF format has anything for this either. I've found i18n-list-generator on npm, but it's way too simplistic.

Has anyone dealt with something like this? What did you do? Is there any library out there that handles this, so I can take a look at its API and learn how it does things?

Drainpipe answered 21/8, 2017 at 16:51 Comment(3)
The best practice is to avoid this kind of merging whenever possible and make the dynamic part more standalone (and therefore more easily translatable) by using e.g. colons and parentheses. Compare "Found %d files" with "Found files: %d". - European
@GSerg, yes, not having to deal with the translations at all would be the easiest way to deal with them ;) I guess I can just make rules for selected "first-class" languages and leave it as a dull machine-generated list for the others (surely that would be much better than leaving it untranslated). But if someone knows of a way to deal with this nicely, I'm interested in learning about the model. - Drainpipe
Do you have any new experience to share? An additional issue is that the strings you fetch from your database are either (a) technical names that are not translated at all or (b) names pre-translated server-side. Either way, the database name does not know the sentence context (e.g. grammatical form) in which it is used, and client-side translations cannot be provided for dynamic names by nature. I think in most cases it should be solved by layout (you don't want to join 500 items in a single sentence), but I have also come across cases where it's much better to have a "simple" sentence (notifications). - Byssinosis

Here's how I would approach it:

  1. No concatenation. All string joining needs to be done via format strings with placeholders.

  2. Only use format strings that support named/numbered placeholders. E.g. {FOO} or $1 instead of %s (this is to allow for parameter reordering). Named placeholders are also better since they give more context to translators. Let's assume we're using {FOO}-style placeholders.

  3. To render a list, I would use a couple of format strings, e.g. joinItem = "{LIST}, {ITEM}" to append items to the list and joinLastItem = "{LIST} and {ITEM}" to append the last item. This lets translators render strings like Foos, Bars and Bazes, change the punctuation, and even reverse the ordering of the list if necessary.

  4. Finally, you can use the final format string, e.g. weHaveTheseItems = "We have a wide choice of {ITEMS} in our code", assuming the {ITEMS} gets replaced with the previously rendered string.
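
Steps 3 and 4 could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the MESSAGES dictionary, the render_list and render_sentence helpers, and the "x-demo" pseudo-locale are all made up for the example (a real project would load the format strings from its translation files), and it uses plain str.format rather than any particular l10n library.

```python
# Hypothetical in-memory catalog; real format strings would come from
# per-locale translation files maintained by translators.
MESSAGES = {
    "en": {
        "joinItem": "{LIST}, {ITEM}",
        "joinLastItem": "{LIST} and {ITEM}",
        "weHaveTheseItems": "We have a wide choice of {ITEMS} in our code",
    },
    # Invented pseudo-locale, only to show that punctuation is swappable.
    "x-demo": {
        "joinItem": "{LIST}、{ITEM}",
        "joinLastItem": "{LIST}、{ITEM}",
        "weHaveTheseItems": "Items on offer: {ITEMS}",
    },
}


def render_list(items, msgs):
    """Fold the items into one string using only translatable format strings."""
    if not items:
        raise ValueError("expected at least one item")
    *head, last = items
    if not head:
        return last
    rendered = head[0]
    # Append every item except the last with joinItem ...
    for item in head[1:]:
        rendered = msgs["joinItem"].format(LIST=rendered, ITEM=item)
    # ... and the last one with joinLastItem.
    return msgs["joinLastItem"].format(LIST=rendered, ITEM=last)


def render_sentence(items, locale):
    msgs = MESSAGES[locale]
    return msgs["weHaveTheseItems"].format(ITEMS=render_list(items, msgs))


print(render_sentence(["Foos", "Bars", "Bazes"], "en"))
print(render_sentence(["Foos", "Bars", "Bazes"], "x-demo"))
```

Because the code never concatenates strings itself, a translator can change the separators or swap {LIST} and {ITEM} without any code change. It does not address inflecting the individual items.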

Shameless self-promotion: you may want to have a look at the Plurr library that supports such {FOO}-style placeholders, as well as plurals (something you will likely need for such messages). It supports JavaScript among other languages.

Epiphragm answered 1/11, 2017 at 2:30 Comment(1)
I like 1 and 2. I find 3 and 4 interesting, but how many human languages can work with that? I had a look at the Plurr library and it appears to support little more than the original gettext (besides nicer syntax). - Whitening

This is a pain: as you point out, not all locales can be expected to support the ",,,,and" form.

Inspired by @GSerg and @Igor Afanasyev I came up with a GNU Gettext based solution like the following (pseudo gettext invocation):

GettextPlural(
    // TRANSLATORS: For multiple "choices", each will be prefixed with a new-line (\n)
    "We have a wide choice of {choices} in our code",
    "In our code we have a wide choice of{choices}", choices.Count)

should print like:

"We have a wide choice of FOOs in our code"

"In our code we have a wide choice of
FOOs
BARs
BAZs"

Remember to pass --add-comments=TRANSLATORS to your xgettext invocation so that the translator comment ends up in the extracted catalog.

For Web purposes you could use <ul><li>...</li>... </ul> or whatever instead of \n.

The benefit is that a one-item-per-line layout is at least as universal as the UI layout itself, while translators can still use whatever plural forms their locale needs, even when those differ from the English ones.

Some languages have only one plural form so their translation must work with both a single choice and multiple choices, so in particular, they cannot have a conditional new-line.
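
For what it's worth, the pseudo invocation above could look roughly like this with Python's standard gettext module. Treat it as a sketch: describe_choices is a name invented for this example, no catalog is installed here, so ngettext simply falls back to the English strings, and the single-item branch is exactly the "conditional new-line" that a one-plural-form locale cannot rely on.

```python
from gettext import ngettext


def describe_choices(choices):
    """Render the sentence for one choice or the line-per-item layout for several."""
    if not choices:
        raise ValueError("expected at least one choice")
    # TRANSLATORS: For multiple "choices", each will be prefixed with a new-line (\n)
    template = ngettext(
        "We have a wide choice of {choices} in our code",
        "In our code we have a wide choice of{choices}",
        len(choices),
    )
    if len(choices) == 1:
        rendered = choices[0]
    else:
        # Each item goes on its own line, so no locale-specific joining is needed.
        rendered = "".join("\n" + choice for choice in choices)
    return template.format(choices=rendered)


print(describe_choices(["FOOs"]))
print(describe_choices(["FOOs", "BARs", "BAZs"]))
```

For HTML output, the same function could build the <ul><li>...</li></ul> markup in place of the "\n"-prefixed lines.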

Bolection answered 12/7, 2019 at 8:59 Comment(0)
