Why does Chrome incorrectly determine page is in a different language and offer to translate?
C

6

212

The new Google Chrome auto-translation feature is tripping up on one page within one of our applications. Whenever we navigate to this particular page, Chrome tells us the page is in Danish and offers to translate. The page is in English, just like every other page in our app. This particular page is an internal testing page that has a few dozen form fields with English labels. I have no idea why Chrome thinks this page is Danish.

Does anyone have insights into how this language detection feature works and how I can determine what is causing Chrome to think the page is in Danish?

Chickpea answered 18/3, 2010 at 3:58 Comment(4)
This is a long shot, but does the page have very few words? Try some other pages that have few words: do they exhibit the same symptom? My guess is there's a configuration somewhere on the server that sets the locale to Danish, and because there are not enough words on the page to determine the language, Chrome just goes with the server's assumption.Deidradeidre
See also https://mcmap.net/q/128724/-how-to-specify-your-webpage-39-s-language-so-google-chrome-doesn-39-t-offer-to-translate-itMeanwhile
Norwegian Bokmål here. I used the word 'Barf' on a few buttons. I changed the word to 'Bounce' and now Chrome thinks it's Dutch. Whaaaaaat?Breakage
@Breakage Dutch guy here. 'Barf' is not even a Dutch word that I ever heard of! Also no idea why Google thinks it's Dutch :pCoverlet
V
252

Update: according to Google

We don’t use any code-level language information such as lang attributes.

They recommend you make it obvious what your site's language is. Use the following, which seems to help, although Content-Language is deprecated and Google says they ignore lang:

<html lang="en" xml:lang="en" xmlns="http://www.w3.org/1999/xhtml">
<meta charset="UTF-8">
<meta name="google" content="notranslate">
<meta http-equiv="Content-Language" content="en">

If that doesn't work, you can always place a bunch of text (your "About" page for instance) in a hidden div. That might help with SEO as well.

EDIT (and more info)

The OP is asking about Chrome, so Google's recommendation is posted above. There are generally three ways to accomplish this for other browsers:

  1. W3C recommendation: Use the lang and/or xml:lang attributes in the html tag:

    <html lang="en" xml:lang="en" xmlns="http://www.w3.org/1999/xhtml">
    
  2. Use meta http-equiv (as described above). UPDATE: this was previously a Google recommendation and is now deprecated in the spec, although it may still help with Chrome:

    <meta http-equiv="Content-Language" content="en">
    
  3. Use HTTP headers (not recommended based on cross-browser recognition tests):

    HTTP/1.1 200 OK
    Date: Wed, 05 Nov 2003 10:46:04 GMT
    Content-Type: text/html; charset=iso-8859-1
    Content-Language: en
    

Exit Chrome completely and restart it to ensure the change is detected. Chrome doesn't always pick up the new meta tag on tab refresh.

Victuals answered 28/6, 2010 at 8:4 Comment(19)
Here's a description of Google's meta tags: support.google.com/webmasters/bin/…Rockett
@RickM, no? This is from Google: "If you're a webmaster and would prefer your web page not be translated by Google Translate, just insert the following meta tag into your HTML file: <meta name="google" value="notranslate">" See: support.google.com/translateVictuals
Nope, definitely not working on both Windows and Mac versions of Chrome. Seems that a number of people can't get it to work... seems a bit hit and miss!Opportunism
@Emile: It works, if you load the page in a new tab. It doesn't work if you just press F5 to refresh.Tayler
In html5 it should be content instead of value: <meta name="google" content="notranslate" />Iman
Setting the correct response headers would be preferable over http-equiv meta tags.Postorbital
@Jack, that's neither the recommendation of Google or the W3C. Although your challenge did turn up interesting info which called my answer into question: w3.org/International/tests/html-css/language-declarations/…Victuals
Interesting results, especially since http-equiv should mean it works the same as an HTTP response header.Postorbital
Chrome seems to do whatever it wants. I can return txt files in English, specifying that they are ASCII in the HTTP response headers, and even if the data only contains ASCII characters, Chrome still does a frequency analysis on the bytes and prompts the user that it is in a different language.Gao
I'm having the same issue: even though <html lang="de"> is set, Chrome thinks it's English.Aerostation
Don't work on Chrome or Search, but I imagine the difficulty is that you are using meta tags supported by the latter to tell the former what to do. Interestingly, these folks are having the opposite problem.Legroom
All the above didn't work for me. I tried a lot of tricks but Chrome is really stubborn in trusting itself to detect the language correctly, no matter how many language attributes and headers you set. What helped was to add like 10 extra words in English and it stopped asking to translate my page from Portuguese to English.Elastic
Note that you have to hard reload (or close and reopen chrome) before the translate message will disappear; not just a normal reload.Rhizo
Just confirming, I had to close chrome entirely for this change to be picked up correctly. A new tab or page reload isn't enough.Perloff
<meta http-equiv="content-language" is now obsolete, according to developer.mozilla.org/en-US/docs/Web/HTML/Element/….Incursive
@Ja͢ck actually I'm pretty certain the standards bodies agree that HTTP headers are preferred over HTML meta tags in general. Not sure about the Translation toolbar though. But for certain things it's clear why the HTTP response headers are generally preferred. These work for all HTTP resources, not just HTML and they are not dependent on the response body. Think about the contradiction of reading content type and charset info from the content... you need to know charset to parse content to find charset.. which is why the meta tag for Content-Type must be near the top of the page.Coverlet
Jack, it was intended for @Kyle but I had two at mentions and only one was allowed and I removed the wrong one. Sorry.Coverlet
I had to put lang="en-US"; lang="en" didn't stop it. I realized it was offering Latin because I had a bunch of Lipsum in an example page.Midget
Updates to hack around chrome are normally outdated as fast as they are posted. Is there anything left in this post that still works?Oatmeal
M
17

I added lang="en" to the doctype declaration, added meta tags for charset utf-8 and Content-Language in the HTML header, and specified charset as utf-8 and Content-Language as en in the HTTP response headers. None of it stopped Chrome from declaring my page was in Portuguese. The only thing that fixed the problem was adding this to the HTML header:

<meta name="google" content="notranslate">

But now I've prevented users from translating my page that is clearly in English to their own language. Poor job, Chrome. You can be better than this.

Marris answered 20/11, 2017 at 15:55 Comment(1)
So true! They say 'We don’t use any code-level language information such as lang attributes'. Yeah, because that would be weird. Instead, we use some secret/proprietary magic algorithm. When IE did this for determining Content-Type, we said they did not follow standards, but when we do it, suddenly it's great. Yay!Coverlet
A
5

Specify the default language for the document, then use the translate attribute and Google's notranslate class per element/container, as in:

<html lang="en">
    ...
    <span><a href="#" translate="no" class="notranslate">English</a></span>

Explanation:

The accepted answer presents a blanket solution, but does not address how to specify the language per element, which can fix the bug and ensure your page remains translatable.

Why is this better? This cooperates with Google's internationalization rather than shutting it off. Referring back to the OP:

Why does Chrome incorrectly determine page is in a different language and offer to translate?

Answer: Google is trying to help you with internationalization, but we need to understand why this is failing. Building off of NinjaCat's answer, we assume that Google reads and predicts the language of your website using an N-gram algorithm -- so, we can't say exactly why Google wants to translate your page; we can only assume that:

  1. There are words on your page that belong to a different language.
  2. Marking the containing element as translate="no" and lang="en" (or removing these words) will help Google to correctly predict the language of your page.

Unfortunately, most people reaching this post won't know which words are causing the trouble. Use Chrome's built-in "Translate to English" feature (in the right-click context menu) to see what gets translated; you may see unexpected translations, as in the original screenshot:

[Image: screenshot showing unexpected translations]

So, update your html with the appropriate translation tags until the Google Translation of your page changes nothing -- then we should expect the popup to go away for future visitors.

Won't it be a lot of work to add all these extra tags? Yes, very likely. If you are using WordPress or another content management system, look in its documentation for quick ways to update your code!
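
As a rough illustration of the per-element approach described above, here is a minimal sketch. The page title, the product name and the Danish phrase are made up for this example and are not taken from the OP's page. The idea is to tag only the text that might confuse the detector while the rest of the page stays translatable:

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>Internal test form</title>
</head>
<body>
  <!-- Ordinary English content: left translatable for non-English readers -->
  <p>Fill in the fields below and press Submit.</p>

  <!-- Hypothetical product name that should never be translated.
       translate="no" is the standard HTML attribute; class="notranslate"
       is the Google-specific class mentioned above. -->
  <label for="sku">SKU for <span translate="no" class="notranslate">Barf Bounce</span></label>
  <input id="sku" name="sku" type="text">

  <!-- Hypothetical foreign-language snippet: declare its actual language
       so it is not read as part of the English text. -->
  <p>Greeting shown to Danish users: <q lang="da">Velkommen tilbage</q></p>
</body>
</html>
```

Given Google's statement that code-level language information is not used for detection, there is no guarantee Chrome honors these hints. Test in a fresh browser session, since several comments report that a plain reload does not pick up the change.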

Achromatism answered 16/7, 2019 at 17:39 Comment(1)
This works for me, the meta tags were still allowing the translate popup.Hunsinger
R
2

Without knowing what the text was, I can only guess, but perhaps the n-gram detection is being tricked by the content of your page.

http://googleresearch.blogspot.com/2006/08/all-our-n-gram-are-belong-to-you.html

https://en.wikipedia.org/wiki/N-gram

Reachmedown answered 23/7, 2010 at 19:31 Comment(3)
But the question is, how can I debug it or get more info for Chrome to figure out exactly why it made the choice it did?Chickpea
Without seeing the text, I cannot say for sure. Some things to try: - If you copy the text and paste it into translate.google.com, and set it to "Detect Language", does it tell you that it's English or not? - If it says it's Danish or whatever, then I would start removing sentences until you find the troublemaker.Reachmedown
Hi Sam - That's in effect what I am suggesting. There's no way to ask it why it made the decision. There's some sentence or wording in your text that is tricking it (after all machine translation is not nearly perfect). In order to debug this thing I would take out sentence by sentence until it recognizes the correct language.Reachmedown
L
1

Chromium thinks this page is in Filipino: http://www.reyalvarado.com/portfolio/cuba/ Notes: There is pretty much no text on the page except for the owner's name and the menu items. Menu items are dynamically replaced with images by FLIR.

The HTML declares the page as US English:

<html xmlns="http://www.w3.org/1999/xhtml" dir="ltr" lang="en-US"> 
Ligroin answered 8/5, 2010 at 21:25 Comment(3)
Yeah, I have the same issue. Not much text on the page, and the <html> element has lang="en" and xml:lang="en". Chrome ignores it!Rockett
@JoshuaDavis, I tried everything above: the lang attribute and the meta tags (except the notranslate one). What finally fixed it for me was adding the dir="ltr" attribute.Wrongly
dir="ltr" is... direction, left to right I guess? Wow.Rockett
Y
0

If the other solutions don't work, try adding the xml:lang attribute to the <html> element:

<html class="no-js" lang="pt-BR" dir="ltr" xml:lang="pt-BR">
Yates answered 7/3, 2012 at 20:23 Comment(2)
This approach didn't work for me. Chrome seems to ignore lang="..." and xml:lang="...".Rockett
This works at confusing chrome into not knowing what language the page is, so it won't offer a translation.Rhizo
