Why is IE failing to show UTF-8 encoded text?

I have some Chinese characters that I'm trying to display on a Kentico-powered website. This text is copy/pasted into Kentico's FCK editor, and is then saved and appears on the site. In Firefox, Chrome, and Safari, the characters appear exactly as expected. In IE 8 Standards mode, I see only boxes.

The text is UTF-8 encoded, and as far as I can tell, it is encoded correctly in the response from the server. There is a Content-Type: text/html; charset=utf-8 response header, and a <meta http-equiv="content-type" content="text/html; charset=UTF-8" /> meta tag on the page too. When I download the HTML from the server and compare the bytes of the characters in question to the original UTF-8 text document, the bytes all match, except the HTML does not include a BOM.
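The byte comparison described above can be sketched in Python. This is only an illustration under assumptions: the helper name `inspect_utf8` and the sample strings are hypothetical stand-ins, not data from the actual site.

```python
import codecs

def inspect_utf8(data: bytes) -> dict:
    """Report whether data starts with a UTF-8 BOM and whether the rest decodes as UTF-8."""
    has_bom = data.startswith(codecs.BOM_UTF8)  # BOM_UTF8 is b"\xef\xbb\xbf"
    payload = data[len(codecs.BOM_UTF8):] if has_bom else data
    try:
        text = payload.decode("utf-8")
        valid = True
    except UnicodeDecodeError:
        text = None
        valid = False
    return {"has_bom": has_bom, "valid_utf8": valid, "text": text}

# Hypothetical stand-ins for the original text document and the downloaded HTML.
original = codecs.BOM_UTF8 + "\u4e2d\u6587".encode("utf-8")
served = "\u4e2d\u6587".encode("utf-8")

a = inspect_utf8(original)
b = inspect_utf8(served)
print(a["has_bom"], b["has_bom"])  # BOM present in one input, absent in the other
print(a["text"] == b["text"])      # the character bytes match once the BOM is ignored
```

A check like this only confirms that the served bytes are valid UTF-8; it says nothing about how a particular browser mode renders them.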

This seems to be specific to IE 8 in Standards mode. In IE 8 Quirks mode it works, as it does in IE 7 Standards and IE 7 Quirks. I'm not sure how Standards mode would cause this problem.

Strangely, if I view-source from IE, the characters show up in the source view correctly.

Any suggestions on what might be wrong here? Am I missing something obvious?

Oligarch answered 13/8, 2010 at 17:42 Comment(1)
We had an issue with IE11 not showing UTF-8 icons sometimes and I found this question in my hunt for a solution, but my issue was actually caused by no-store and no-cache headers as described in this Font Awesome troubleshooting page. Just in case anyone else finds themselves here with the same problem. - Vasiliki
S
11

I can't explain this in detail. But this is indeed a known problem.

Here's a small reproducible code snippet:

<!DOCTYPE html>
<html lang="en">
    <head><title>test</title></head>
    <body><p>&#65185;<br>0 0</p></body>
</html>

Save it in UTF-8 and view it in IE8. You see nothing. Replace 0 0 with 00 and reload the page. It'll work fine! This is absolutely astonishing. Weirdly, replacing 0 0 with a a, or the <br> with </p><p>, fixes it as well. It probably has something to do with failures in whitespace rendering.

Sorry, I don't have authoritative resources proving this, but it is just more evidence that IE8 isn't as good as we expect it to be. Your best bet is to change the HTML and/or build it up step by step until it works. If that is in vain, add the following meta tag to the head to force IE8 into IE7 mode:

<meta http-equiv="X-UA-Compatible" content="IE=7" />
Sommers answered 13/8, 2010 at 18:13 Comment(4)
That's crazy! But, forcing IE 7 compatibility mode indeed works on the public side. Now I just have to figure out how to force compatibility mode inside the FCK Editor iframe so the user can actually edit the text. Thanks for the info! - Oligarch
Cheers! Much luck with this weird problem. - Sommers
I've just wasted 4 hours on this one to come to a similar solution (doh!). Rather than fix the compat mode to IE7 though (as IE9 is out) I simply set it to "IE=Edge" which is the same as saying "use the latest you know about". This seemed to force IE9 to recognise the fact that it should be in UTF-8, most odd but it works. Here for reference if others get it too. - Dustpan
f*** me. I despise IE8 - Dogcatcher
L
8

The default IE encoding is Western European (ISO), so you need to either change it manually to UTF-8 or force IE to use a given encoding like this:

  • HTML 4.01

    <meta http-equiv="content-type" content="text/html; charset=UTF-8">

  • HTML 5

    <meta charset="UTF-8">

You also need to use the lang attribute on the <html> tag to declare the language:

    <html lang="zh">

for Chinese
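To illustrate why the declared encoding matters, here is a minimal Python sketch of what happens when UTF-8 bytes are read as Western European text. It assumes "Western European (ISO)" corresponds to ISO-8859-1; the function name `misread_as_latin1` is hypothetical, and this models only the decoding mismatch, not IE's actual detection logic.

```python
def misread_as_latin1(text: str) -> str:
    """Encode as UTF-8, then decode the bytes as ISO-8859-1, as a misconfigured reader might."""
    return text.encode("utf-8").decode("iso-8859-1")

sample = "\u4e2d"                  # a Chinese character, three bytes in UTF-8
garbled = misread_as_latin1(sample)
print(len(sample), len(garbled))   # one character turns into three unrelated ones

# Reversing the mistake recovers the original text, so no bytes were lost.
print(garbled.encode("iso-8859-1").decode("utf-8") == sample)
```

Plain ASCII text survives this mismatch unchanged, which is why the problem tends to show up only on non-English content.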

Laure answered 13/2, 2013 at 10:30 Comment(1)
This works perfectly! Adding the language attribute to the html tag did the trick. - Eckblad
Y
3

Just a wild guess, but it might be a font issue. Maybe the fonts available to your browser can't represent said Chinese characters.

Yippie answered 13/8, 2010 at 17:50 Comment(0)
D
2

I managed to fix the same issue by changing the file's UTF format to "UTF8 With Byte Order Mark".

(The editor I use lets me switch file formats easily. I'm not sure how to proceed otherwise, but it's worth taking a look at the different UTF file formats: IE 8 simply doesn't like UTF-8 without a byte order mark.)
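If your editor can't switch formats, the same conversion can be approximated with a few lines of Python. This is a sketch under assumptions: the helper name `with_utf8_bom` and the sample page content are hypothetical, and the input is assumed to already be valid UTF-8.

```python
import codecs

def with_utf8_bom(data: bytes) -> bytes:
    """Return data with exactly one UTF-8 BOM at the start."""
    if data.startswith(codecs.BOM_UTF8):
        return data  # already has a BOM, don't add a second one
    return codecs.BOM_UTF8 + data

page = "<p>caf\u00e9</p>".encode("utf-8")  # hypothetical page saved without a BOM
fixed = with_utf8_bom(page)
print(fixed[:3] == codecs.BOM_UTF8)

# The "utf-8-sig" codec strips the BOM when decoding, so the text itself is unchanged.
print(fixed.decode("utf-8-sig") == page.decode("utf-8"))
```

When writing a file directly, opening it with `encoding="utf-8-sig"` makes Python emit the BOM for you.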

I was also able to reproduce the snippet from the answer above:

<!DOCTYPE html>
<html lang="en">
    <head><title>test</title></head>
    <body><p>&#65185;<br>0 0</p></body>
</html>

But my results were "intermittent" while in UTF-8 without BOM (sometimes accents would show up, other times the weird chars, and it didn't look like a whitespace rendering issue to me...). Note that I was fiddling with lang="fr" and lang="es", but in all cases, changing the UTF file format seems to have permanently resolved my accent display issues. :)

I'm not 100% familiar with UTF, but if the chars are coded using 2 bytes, one would have to assume that white-space issues and misunderstood chars could be related to misaligned bytes in the sources.
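On the byte-width guess: UTF-8 is a variable-length encoding, so characters take between one and four bytes rather than a fixed two. A small Python sketch makes this visible; the sample characters are my own choice, apart from `&#65185;`, which is the character reference used in the test page.

```python
import html

samples = {
    "a": "a",                               # ASCII letter
    "e-acute": "\u00e9",                    # accented Latin letter
    "zhong": "\u4e2d",                      # CJK ideograph
    "&#65185;": html.unescape("&#65185;"),  # character reference from the test page
}

# Print each character's code point, UTF-8 byte count, and the bytes in hex.
for label, ch in samples.items():
    encoded = ch.encode("utf-8")
    print(label, f"U+{ord(ch):04X}", len(encoded), encoded.hex())
```

Note that a numeric character reference such as `&#65185;` is plain ASCII in the file itself, so the test page contains no multi-byte sequences at all.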

Donal answered 26/7, 2013 at 18:52 Comment(0)
A
0

This may be the same kind of thing that caused Rails 3 to add a snowman character to their output: What is the _snowman param in Ruby on Rails 3 forms for?

Anthraquinone answered 13/8, 2010 at 17:52 Comment(0)
