How to display all non-English characters correctly in a web site?

It's annoying to see even the most professional sites get this wrong. Posted text turns into something unreadable. I don't know much about encodings. I just want to understand the problem that makes such a basic thing so hard.

  • Does HTTP encoding limit some characters?
  • Do users need to send info about the charset/encoding they are using?
  • Assuming everything arrives at the server intact, is the encoding used to save that text causing the problem?
  • Is it something about browser implementations?
  • Do we need some JavaScript tricks to make it work?

Is there an absolute solution to this? It may have its limits but StackOverflow seems to make it work.

Mignon answered 19/4, 2011 at 19:50 Comment(3)
Can you provide examples of what you consider the correct and wrong way? - Bureau
I hope it's more specific now. - Unloosen
Do you have a link? This could be as simple as a missing or wrong content-type. - Ascribe

I suspect one needs to make sure that the whole stack handles the encoding with care:

  • Specify a web page font (CSS) that supports a wide range of international characters.
  • Specify correct lang and charset attributes in the HTML (for example `<meta charset="utf-8">`) and make sure that the browser is using the correct encoding.
  • Make sure the HTTP requests are sent with the appropriate charset specified in the headers.
  • Make sure the content of the HTTP requests is decoded properly in your web request handler.
  • Configure your database/datastore with an internationalization-friendly encoding/collation (such as UTF-8/UTF-16) and not one that just supports Latin characters (the default in some DBs).
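To illustrate why every layer has to agree on the encoding, here is a minimal Python sketch. The sample string and variable names are made up for illustration. It shows the same bytes decoded with the right and the wrong charset, and what happens when a Latin-only encoding is asked to store non-Latin text.

```python
# Sample text with Polish, Japanese and accented Latin characters (illustrative only).
text = "Zażółć gęślą jaźń, 日本語, Ünïcödé"

# The sender encodes the text as UTF-8 bytes, as a browser does for a UTF-8 page.
wire_bytes = text.encode("utf-8")

# A layer that assumes the wrong charset decodes without raising an error,
# but the result is unreadable "mojibake".
garbled = wire_bytes.decode("latin-1")

# A layer that uses the same charset as the sender recovers the original text.
restored = wire_bytes.decode("utf-8")

# A Latin-only encoding cannot represent the text at all.
try:
    text.encode("latin-1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False

print(restored == text)  # the round trip through UTF-8 preserves the text
print(garbled == text)   # the mismatched decode does not
print(latin1_ok)         # whether Latin-1 could store the text
```

The wrong decode fails silently, which is probably why this kind of bug often goes unnoticed until the garbled text is already stored.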

The first few are normally handled by the browser and web framework of choice, but if you screw up the DB encoding or use a font with a limited character set, there will be no one to save you.
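For the header, markup and request-decoding points, a rough sketch of what a framework typically does for you might look like the following bare WSGI application. The `app` function, the `msg` parameter and the `PAGE` template are hypothetical names chosen for this example, not part of any particular framework.

```python
import html
from urllib.parse import parse_qs

# Hypothetical page template; the meta tag repeats the charset sent in the header.
PAGE = (
    '<!DOCTYPE html><html lang="en"><head><meta charset="utf-8">'
    "<title>Echo</title></head><body><p>{message}</p></body></html>"
)


def app(environ, start_response):
    """Echo the 'msg' query parameter, using UTF-8 at every step."""
    # Decode the percent-encoded query string explicitly as UTF-8.
    params = parse_qs(environ.get("QUERY_STRING", ""), encoding="utf-8")
    message = params.get("msg", [""])[0]

    # Escape for HTML, then encode the response body as UTF-8 bytes.
    body = PAGE.format(message=html.escape(message)).encode("utf-8")

    # Tell the browser which charset the bytes are in.
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

It could be served locally with the standard library's `wsgiref.simple_server.make_server("", 8000, app).serve_forever()`. A real site would more likely rely on its framework's defaults, which usually amount to the same thing.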

Reposeful answered 19/4, 2011 at 21:4 Comment(0)
