Getting the user's region with navigator.language
C

7

23

For some time, I've been using something like this to get my user's country (ISO-3166):

const region = navigator.language.split('-')[1]; // 'US'

I've always assumed the string would be similar to en-US -- where the country would hold the 2nd position of the array.

I now think this assumption is incorrect. According to the MDN docs, navigator.language returns a "string representing the language version as defined in BCP 47." Reading BCP 47, the primary language subtag is guaranteed to come first (e.g., 'en'), but the region code is not guaranteed to be the 2nd subtag. Other subtags can precede and follow the region subtag.

For example "sr-Latn-RS" is a valid BCP 47 language tag:

sr                |  Latn           |  RS
primary language  |  script subtag  |  region subtag

Is the value returned from navigator.language a subset of BCP 47 containing only language and region? Or is there a library or regex that is commonly used to extract the region subtag from a language tag?

Colbert answered 29/8, 2016 at 19:35 Comment(0)
C
2

Regex found here: https://github.com/gagle/node-bcp47/blob/master/lib/index.js

var re = /^(?:(en-GB-oed|i-ami|i-bnn|i-default|i-enochian|i-hak|i-klingon|i-lux|i-mingo|i-navajo|i-pwn|i-tao|i-tay|i-tsu|sgn-BE-FR|sgn-BE-NL|sgn-CH-DE)|(art-lojban|cel-gaulish|no-bok|no-nyn|zh-guoyu|zh-hakka|zh-min|zh-min-nan|zh-xiang))$|^((?:[a-z]{2,3}(?:(?:-[a-z]{3}){1,3})?)|[a-z]{4}|[a-z]{5,8})(?:-([a-z]{4}))?(?:-([a-z]{2}|\d{3}))?((?:-(?:[\da-z]{5,8}|\d[\da-z]{3}))*)?((?:-[\da-wy-z](?:-[\da-z]{2,8})+)*)?(-x(?:-[\da-z]{1,8})+)?$|^(x(?:-[\da-z]{1,8})+)$/i;

let foo = re.exec('de-AT');      // German in Austria
let bar = re.exec('zh-Hans-CN'); // Simplified Chinese using Simplified script in mainland China

console.log(`region ${foo[5]}`); // 'region AT'
console.log(`region ${bar[5]}`); // 'region CN'
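
If the full regex feels heavy, a lighter sketch is to walk the subtags in order, since BCP 47 places the region after the primary language, any extended-language subtags, and an optional script. The name regionFromTag is hypothetical. The sketch assumes a well-formed tag and does not special-case grandfathered or private-use tags such as i-klingon, so treat it as a best-effort helper rather than a validator.

```javascript
// Sketch: find the region subtag by position rules rather than a fixed index.
// Assumes a well-formed tag; grandfathered and private-use tags are not special-cased.
function regionFromTag(tag) {
  const subtags = String(tag).split('-');
  let i = 1; // subtags[0] is the primary language

  // Skip up to three extended-language subtags (3 letters each), e.g. 'yue' in 'zh-yue-HK'.
  let extlangs = 0;
  while (i < subtags.length && extlangs < 3 && /^[a-z]{3}$/i.test(subtags[i])) {
    i++;
    extlangs++;
  }

  // Skip an optional script subtag (4 letters), e.g. 'Latn'.
  if (i < subtags.length && /^[a-z]{4}$/i.test(subtags[i])) {
    i++;
  }

  // A region is either 2 letters (ISO 3166-1) or 3 digits (UN M.49).
  const candidate = subtags[i];
  if (candidate !== undefined && /^(?:[a-z]{2}|\d{3})$/i.test(candidate)) {
    return candidate.toUpperCase();
  }
  return undefined;
}

console.log(regionFromTag('sr-Latn-RS')); // 'RS'
console.log(regionFromTag('de-CH-1901')); // 'CH' (a variant follows the region)
console.log(regionFromTag('de'));         // undefined
```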
Colbert answered 2/9, 2016 at 1:52 Comment(4)
Why use a regex when you can just use split, like below: const parts = navigator.language.split('-'); const region = parts[parts.length-1] - Patrica
The region is not guaranteed to be in the 2nd position of the array after split. See the example above. - Colbert
That's right, but the above code always takes the last one regardless of its actual position, doesn't it? - Patrica
Not necessarily the last position either. Other subtags may come after the region. - Colbert
P
11

Your solution is based on the false premise that the browser's language tag reliably matches the user's country. E.g., I have set my browser language to German, even though I am living nowhere near Germany at the moment, but rather in the United States.

Also, for example in Chrome, many language packs do not require you to specify the region modifier. Setting Chrome's display language to German


provides the following language tag:

> navigator.language
< "de"

No region tag at all, and a fairly common language.

Bottom line is, my browser setup results in language tag de, even though I live in the United States.


A more accurate and possibly more reliable way to determine the user's location is to derive it from the IP address associated with the request. Numerous services offer this; ip-api.com is one of them:

$.get("http://ip-api.com/json", function(response) {
  console.log(response.country);     // "United States"
  console.log(response.countryCode); // "US"
}, "jsonp");
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
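
For pages that do not load jQuery, the same lookup could be sketched with fetch. This assumes the endpoint is reachable from the page and still returns the countryCode field shown above; lookupCountryCode and its fetchImpl parameter are hypothetical names, the latter added only so the function can be exercised without a network call.

```javascript
// Sketch: the same IP-based lookup using fetch instead of jQuery.
// fetchImpl is injectable (an assumption for testability); it defaults to the global fetch.
async function lookupCountryCode(fetchImpl = fetch) {
  const response = await fetchImpl('http://ip-api.com/json');
  if (!response.ok) {
    throw new Error(`Lookup failed with HTTP status ${response.status}`);
  }
  const data = await response.json();
  return data.countryCode; // e.g. "US", per the response shape in the answer
}
```

In a page this would be called as lookupCountryCode().then(code => console.log(code)), with a .catch for the case where the service cannot be reached.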
Proudman answered 1/9, 2016 at 19:36 Comment(1)
Interesting. For my application, 100% support is not a prerequisite for shipment. All I need to do is get a best guess at the user's region. However, in your example, I think you may be getting language and region confused. "de" is a language subtag, not a region subtag. Language to region is not one-to-one. For example, "de-AT" represents German ('de') as it is used in Austria ('AT'). Perhaps the best thing to do is use a combination of APIs: Geolocation, navigator.languages and some REST endpoint. Thanks for the input, but I don't think this precisely answers my question. - Colbert
B
8

You can now extract the region from a locale identifier using the Locale object in the Internationalization API.

const { region } = new Intl.Locale('sr-Latn-RS') // region => 'RS'

Note that this is not currently compatible with Internet Explorer.
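
A slightly more defensive sketch of the same idea, for tags that come from the browser rather than a literal: the Intl.Locale constructor throws on malformed tags, and region is undefined when the tag carries none. The helper names regionOf and likelyRegionOf are hypothetical. maximize() fills in the most likely region for a language, which is only a guess about the language variant and says nothing about where the user actually is.

```javascript
// Sketch: read the region via Intl.Locale, returning undefined when the tag
// has no region, is malformed, or Intl.Locale is unavailable.
function regionOf(tag) {
  if (typeof Intl === 'undefined' || typeof Intl.Locale !== 'function') {
    return undefined;
  }
  try {
    return new Intl.Locale(tag).region;
  } catch (err) {
    return undefined; // the constructor throws on invalid tags
  }
}

// Best guess only: maximize() adds the *likely* script and region for a language.
function likelyRegionOf(tag) {
  try {
    return new Intl.Locale(tag).maximize().region;
  } catch (err) {
    return undefined;
  }
}

console.log(regionOf('sr-Latn-RS'));  // 'RS'
console.log(regionOf('de'));          // undefined
console.log(likelyRegionOf('de'));    // 'DE'
```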

Byte answered 8/8, 2021 at 23:26 Comment(0)
R
2

Be careful: there is both navigator.language and navigator.languages.

langage :

 console.log(navigator.language); // "fr"

langages :

 console.log(navigator.languages); // ["fr", "fr-FR", "en-US", "en"]
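
Since the first entry may carry no region while a later one does, one possible sketch is to scan the list for the first tag that has one. firstRegion is a hypothetical helper; it relies on Intl.Locale and skips entries that fail to parse.

```javascript
// Sketch: return the region of the first language tag that has one.
function firstRegion(languages) {
  for (const tag of languages) {
    try {
      const { region } = new Intl.Locale(tag);
      if (region) {
        return region;
      }
    } catch (err) {
      // Ignore malformed entries and keep scanning.
    }
  }
  return undefined;
}

console.log(firstRegion(['fr', 'fr-FR', 'en-US', 'en'])); // 'FR'
```

In a browser, the call would be firstRegion(navigator.languages).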

To find countries, see Wikipedia on ISO 3166-1 or use a JavaScript library.

Roundel answered 8/9, 2016 at 17:24 Comment(1)
if (Langage != Language) else if (Langages != Languages) ;) - Esma
C
1

In Firefox, you can choose your language settings in preferences:


The list of languages has 269 items, 192 of which do not include any region code.

The region is only useful when a language has different variants depending on the location. This way users can tell the server in which language variant they prefer the response to be.

Do not use this approach to locate the user. It's too unreliable, because the user may not specify any region, or because the user could physically be in another place.

If you want to locate the user, you should use the Geolocation API.

Cyclostyle answered 1/9, 2016 at 21:15 Comment(3)
See my comment in TimoSta's answer: Language != Region and not 1-1. What I'm interested in getting is the user's region, not language. - Colbert
@Colbert Yes. What you might get with navigator.language is the region in which the user's preferred language variant is spoken. That's not the region where the user is, which you can get with the Geolocation API. - Cyclostyle
I need the user's permission to get the location, right? It sounds like a comprehensive solution would involve using many of these and/or allowing the user to select a region. But I was really just wondering if there is a decent way to parse BCP 47 language tags to extract the region if one is provided. I would think this was a fairly common need. - Colbert
W
0

Just as @TimoSta said,

Try this

$.getJSON('http://freegeoip.net/json/', function(result) {
   alert(result.country_code);
});

from Get visitors language & country code with javascript (client-side). See answer of @noducks

Washout answered 8/9, 2016 at 7:2 Comment(0)
K
0

The value you are receiving stems from the Accept-Language header of the HTTP request.

The value of the header can be quite complex, for example:

Accept-Language: da, en-GB;q=0.8, en;q=0.7

As the name implies, the Accept-Language header basically defines acceptable languages, not countries.
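
To make the structure of such a header concrete, here is a minimal parsing sketch. parseAcceptLanguage is a hypothetical helper, not a library function; it assumes a reasonably well-formed header, treats a missing q as 1, and drops entries with q=0.

```javascript
// Sketch: split an Accept-Language value into tags ordered by preference.
function parseAcceptLanguage(header) {
  return header
    .split(',')
    .map((part) => {
      const [tag, ...params] = part.trim().split(';');
      const qParam = params.map((p) => p.trim()).find((p) => p.startsWith('q='));
      const q = qParam ? Number.parseFloat(qParam.slice(2)) : 1;
      return { tag: tag.trim(), q: Number.isNaN(q) ? 0 : q };
    })
    .filter((entry) => entry.tag !== '' && entry.q > 0)
    .sort((a, b) => b.q - a.q); // highest preference first
}

console.log(parseAcceptLanguage('da, en-GB;q=0.8, en;q=0.7'));
// [ { tag: 'da', q: 1 }, { tag: 'en-GB', q: 0.8 }, { tag: 'en', q: 0.7 } ]
```

Only the second entry in this example carries a country at all, which is the point made above.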

A language tag may also contain additional location information, as in 'en-GB', but others like 'en' do not.

In that case, there is simply no information about the country.

It is also not always possible to exactly map a language like 'en' to a country. If the language is 'en', the country might be 'GB' but it may also be 'US'.

What you can do:

  • Determine the country only if the language contains one, as in 'en-GB'.
  • If the language does not contain a country, you have the following options:
      • A few languages are only used in one country, like 'da', Danish, which is spoken only in Denmark (I am guessing here), so you may map these cases.
      • You may use a default for other cases, depending on the language, e.g. map 'en' to 'GB'.
      • You may use a general default like 'US' for all cases where no country can be determined.
      • You can use additional information, e.g. the client's IP address, to determine the country.
      • Finally, you may ask the user to enter the country.
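
The first few options above can be sketched as a fallback chain. The mapping table and the general default are hypothetical values taken from the examples in the list ('da' to 'DK', 'en' to 'GB', 'US' overall), and guessRegion is an illustrative name; a real table would depend on your audience.

```javascript
// Hypothetical per-language defaults, taken from the examples in the list above.
const DEFAULT_REGION_BY_LANGUAGE = new Map([
  ['da', 'DK'],
  ['en', 'GB'],
]);
const GENERAL_DEFAULT_REGION = 'US'; // assumption: last-resort default

// Sketch: use the tag's own region if present, then a per-language default,
// then the general default.
function guessRegion(tag) {
  let locale;
  try {
    locale = new Intl.Locale(tag);
  } catch (err) {
    return GENERAL_DEFAULT_REGION; // malformed tag
  }
  if (locale.region) {
    return locale.region;
  }
  return DEFAULT_REGION_BY_LANGUAGE.get(locale.language) ?? GENERAL_DEFAULT_REGION;
}

console.log(guessRegion('en-GB')); // 'GB' (from the tag itself)
console.log(guessRegion('da'));    // 'DK' (per-language default)
console.log(guessRegion('fr'));    // 'US' (general default)
```

The IP lookup and asking the user are left out here because they need a network call or UI; they would slot in before the general default.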

I collected some additional information about the Accept-Language header here

Knobloch answered 8/9, 2016 at 18:1 Comment(0)
