HTML5 Encoding & Cyrillic

2

13

Something made me curious: supposedly the default character encoding in HTML5 is UTF-8. However, if I have a plain, simple HTML file with an HTML5 doctype like the code below, I get:

"hello" in Russian: "ЗдраÑтвуйте"

In Chrome 33+, Safari 6, IE11, etc.

<!DOCTYPE html>

<html>

<head></head>

<body>
    <p>"hello" in Russian is "здраствуйте"</p>
</body>

</html>

What gives? Shouldn't the browser use UTF-8 and display the text correctly? I'm using Coda, which is set to save HTML files with UTF-8 encoding by default, so that's not the problem.

Landry answered 29/3, 2014 at 19:45 Comment(2)
You can save your file as anything you want, but the browser runs on the user's system, not yours, and you never know what settings their browser has. - Dicast
"hello" in Russian is "здраствуйте" is wrong! "hello" in Russian is "здравствуйте"! - Pinstripe
J
27

The text data in the example is UTF-8 encoded text misinterpreted as windows-1252 encoded text. The reason is that the encoding has not been specified, so browsers are forced to guess. To fix this, specify the encoding; see the W3C page Character encodings. There are two simple ways that work independently of server settings, as long as the server does not send wrong encoding information in HTTP headers:

1) Save the file as UTF-8 with BOM (there is probably an option for this in your authoring program).

2) Add the following tag into the head part:

<meta charset=utf-8>
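As a rough illustration of both options at once, here is a minimal Python sketch that writes the page with a UTF-8 BOM and with the meta tag in the head. The file name hello.html, the helper write_page, and the temporary directory are assumptions made for this example only; the Russian word uses the corrected spelling from the comments.

```python
import codecs
import os
import tempfile

# The asker's page, with the charset declaration added to the head.
PAGE = """<!DOCTYPE html>
<html>
<head><meta charset=utf-8></head>
<body>
    <p>"hello" in Russian is "здравствуйте"</p>
</body>
</html>
"""


def write_page(path, with_bom=True):
    """Write PAGE as UTF-8; "utf-8-sig" makes Python prepend the BOM."""
    encoding = "utf-8-sig" if with_bom else "utf-8"
    with open(path, "w", encoding=encoding, newline="\n") as f:
        f.write(PAGE)


# Hypothetical output location, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "hello.html")
write_page(path)

with open(path, "rb") as f:
    raw = f.read()

print("starts with UTF-8 BOM:", raw.startswith(codecs.BOM_UTF8))
print("declares charset in head:", b"<meta charset=utf-8>" in raw)
```

Either measure alone should be enough; using both is harmless as long as they agree.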

There is no single default encoding specified for HTML5. On the contrary, browsers are expected to make guesses when no encoding has been declared. This is a fairly complex process, described in 8.2.2.2 Determining the character encoding.
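The order in which a browser looks for an encoding can be sketched roughly as follows. This is a heavily simplified approximation, not the algorithm from the specification: the function name sniff_encoding, its parameters, and the windows-1252 fallback are assumptions for illustration, and real browsers take further steps (user overrides, content heuristics, locale-dependent defaults) that are omitted here. The BOM is checked first, as I understand the current WHATWG version of the algorithm.

```python
import re
from typing import Optional

# Simplified stand-in for the "prescan" of the first 1024 bytes.
# Matches both <meta charset=utf-8> and the http-equiv Content-Type form.
META_CHARSET = re.compile(
    rb"""<meta[^>]*charset\s*=\s*["']?([\w-]+)""", re.IGNORECASE
)


def sniff_encoding(
    body: bytes,
    http_charset: Optional[str] = None,
    fallback: str = "windows-1252",  # assumed default for a Western locale
) -> str:
    # 1. A byte order mark, if present, settles the question.
    if body.startswith(b"\xef\xbb\xbf"):
        return "utf-8"
    if body.startswith(b"\xfe\xff"):
        return "utf-16be"
    if body.startswith(b"\xff\xfe"):
        return "utf-16le"
    # 2. Encoding declared by the server in the Content-Type header.
    if http_charset:
        return http_charset.lower()
    # 3. A meta declaration near the start of the document.
    match = META_CHARSET.search(body[:1024])
    if match:
        return match.group(1).decode("ascii").lower()
    # 4. Nothing declared: the browser has to guess.
    return fallback


undeclared = b"<!DOCTYPE html><html><head></head><body></body></html>"
declared = b"<!DOCTYPE html><html><head><meta charset=utf-8></head></html>"
print(sniff_encoding(undeclared))
print(sniff_encoding(declared))
```

The asker's file falls through to step 4, which is why the UTF-8 bytes end up being read as windows-1252.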

Juggler answered 29/3, 2014 at 23:1 Comment(0)
D
10

If you want to be sure which charset the browser will use, put the following in your page's head:

 <meta content="text/html; charset=UTF-8" http-equiv="Content-Type">

otherwise you are at the mercy of local settings and the browser's automatic detection.
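To see what being at the mercy of local settings can look like, the sketch below decodes the same UTF-8 bytes under two different fallback encodings. The choice of windows-1252 and windows-1251 as typical Western and Cyrillic fallbacks is an assumption for illustration, and decode_single_byte is a hypothetical helper: Python's cp1252 codec leaves a few bytes (such as 0x81) undefined, so the helper maps those to the code point with the same number, which is how the WHATWG Encoding Standard treats them.

```python
def decode_single_byte(data: bytes, encoding: str) -> str:
    """Decode one byte at a time with a single-byte codec.

    Bytes the Python codec leaves undefined are mapped to the code point
    with the same numeric value instead of raising an error.
    """
    chars = []
    for byte in data:
        try:
            chars.append(bytes([byte]).decode(encoding))
        except UnicodeDecodeError:
            chars.append(chr(byte))
    return "".join(chars)


word = "здравствуйте"
utf8_bytes = word.encode("utf-8")  # two bytes per Cyrillic letter

as_western = decode_single_byte(utf8_bytes, "cp1252")   # Western fallback
as_cyrillic = decode_single_byte(utf8_bytes, "cp1251")  # Cyrillic fallback
as_utf8 = utf8_bytes.decode("utf-8")                    # declared correctly

print("windows-1252:", repr(as_western))
print("windows-1251:", repr(as_cyrillic))
print("utf-8:       ", repr(as_utf8))
```

The same file produces different garbage on different machines, and only the declared UTF-8 reading recovers the original word.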

Dicast answered 24/4, 2014 at 20:14 Comment(0)
