utf16 or utf32? I'm trying to store content in a lot of languages. Some of the languages use double-wide fonts (for example, Japanese fonts are frequently twice as wide as English fonts). I'm not sure which kind of database I should be using. Any information about the differences between these four charsets...
MySQL's utf32 and utf8mb4 (as well as standard UTF-8) can directly store any character specified by Unicode; the former is fixed size at 4 bytes per character whereas the latter uses between 1 and 4 bytes per character.
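To make the size difference concrete, here is a small Python sketch. It assumes Python's "utf-8" and "utf-32-be" codecs are reasonable stand-ins for MySQL's utf8mb4 and utf32 storage; the helper name byte_widths is made up for illustration.

```python
# Assumption: Python's "utf-8" and "utf-32-be" codecs mirror the per-character
# storage of MySQL's utf8mb4 and utf32. byte_widths is a hypothetical helper.

def byte_widths(ch):
    """Return (utf8_bytes, utf32_bytes) for a single character."""
    return len(ch.encode("utf-8")), len(ch.encode("utf-32-be"))

samples = [
    "A",           # U+0041, Latin letter
    "\u00e9",      # U+00E9, e with acute accent
    "\u65e5",      # U+65E5, a common CJK character
    "\U0001F600",  # U+1F600, an emoji outside the first 65,536 code points
]

for ch in samples:
    u8, u32 = byte_widths(ch)
    print(f"U+{ord(ch):04X}: utf-8 = {u8} bytes, utf-32 = {u32} bytes")
```

The variable-width encoding grows from 1 to 4 bytes across these samples, while the fixed-width one always uses 4.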
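utf8mb3 and the original utf8 can only store the first 65,536 codepoints (the Basic Multilingual Plane), which covers the most commonly used CJKV (Chinese, Japanese, Korean, Vietnamese) characters but not the rarer ones outside that range, and use 1 to 3 bytes per character.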
utf16 uses 2 bytes for the first 65,536 codepoints, and 4 bytes for everything else.
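The 2-byte versus 4-byte split in UTF-16 comes from surrogate pairs. A minimal sketch of that rule in Python follows; utf16_units and units_to_bytes are hypothetical names, and the code assumes its input is a valid code point that is not itself in the surrogate range U+D800 to U+DFFF.

```python
# Sketch of the UTF-16 encoding rule. Helper names are illustrative only.
# Assumes the input is a valid, non-surrogate Unicode code point.

def utf16_units(code_point):
    """Return the list of 16-bit code units for one code point."""
    if code_point < 0x10000:
        return [code_point]              # one unit = 2 bytes
    offset = code_point - 0x10000        # 20 bits left to encode
    high = 0xD800 + (offset >> 10)       # top 10 bits -> high surrogate
    low = 0xDC00 + (offset & 0x3FF)      # bottom 10 bits -> low surrogate
    return [high, low]                   # surrogate pair = 4 bytes

def units_to_bytes(units):
    """Serialize code units as big-endian bytes."""
    return b"".join(u.to_bytes(2, "big") for u in units)

for cp in (0x65E5, 0x1F600):
    encoded = units_to_bytes(utf16_units(cp))
    # Cross-check against Python's built-in codec.
    assert encoded == chr(cp).encode("utf-16-be")
    print(f"U+{cp:04X}: {len(encoded)} bytes in UTF-16")
```

The first code point fits in one unit (2 bytes); the second needs a surrogate pair (4 bytes).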
As for fonts, that's strictly a visual thing.
See also MySQL documentation for Unicode support.
utf8 and utf8mb3 do not cover all CJK characters, some of which are 4 bytes wide. – Remnant
utf8mb4 is the best.
utf8mb4 supports up to 4 bytes per character compared to utf8's maximum of 3 bytes per character, so it covers a wider range of uses without error.
With utf8mb4 you can store emoji, for example. If you try to insert an emoji into a column that uses an unsupported character set, you will get an error (or, with strict SQL mode disabled, a warning and a truncated value).
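If you are stuck on a 3-byte charset for now, you can check text before inserting it. This is only a rough application-side sketch in Python, not anything MySQL provides: fits_in_utf8mb3 and offending_characters are made-up helper names, and the check simply tests whether each code point is within the first 65,536.

```python
# Hypothetical pre-insert check: does this text fit in a 3-byte utf8 column?
# Characters above U+FFFF need 4 bytes in UTF-8 and require utf8mb4.

def fits_in_utf8mb3(text):
    """True if every character is within the first 65,536 code points."""
    return all(ord(ch) <= 0xFFFF for ch in text)

def offending_characters(text):
    """List the code points that would need utf8mb4."""
    return [f"U+{ord(ch):04X}" for ch in text if ord(ch) > 0xFFFF]

common_cjk = "\u65e5\u672c\u8a9e"   # three common Japanese characters
with_emoji = "ok \U0001F600"        # contains U+1F600
rare_cjk = "\U00020BB7"             # a CJK character outside the first 65,536

for label, text in [("common CJK", common_cjk),
                    ("emoji", with_emoji),
                    ("rare CJK", rare_cjk)]:
    print(label, fits_in_utf8mb3(text), offending_characters(text))
```

Note that it is not just emoji that fail the check; the rare CJK character does too.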
utf8mb4 is the more modern of the two and will eventually replace the older version.
utf8_general applies to all the other utf8_* collations too; all will be using MySQL's utf8mb3 aka utf8 charset. – Broadax