Which collation to use so that `ş` and `s` are treated as unique values?

The issue is that MySQL treats `ş` and `s` as identical values.

I'm new to MySQL, so I have no idea which collations would treat them as distinct.

The collations I've tried that don't work are:

  1. utf8_general_ci
  2. utf8_unicode_520_ci
  3. utf8mb4_unicode_ci
  4. utf8mb4_unicode_520_ci

Does anybody know which collation to use?

P.S. I also really need the collation to handle emojis and other non-Latin characters, and as far as I know, the only collations able to do this are the unicode ones?

Outdate answered 8/11, 2018 at 23:52 Comment(5)
those two characters are considered equivalent, at least by Unicode's collation standard. unicode.org/reports/tr15 -- Marathi
@Marathi i know that because that's the very issue i'm facing right now lol. thanks nonetheless. -- Outdate
I'm saying there wouldn't be a unicode conforming collation that would treat those characters as unique. -- Marathi
@Marathi ahh i see what you're saying now. my buddy just actually said unicode_bin will work since it doesn't strip accents -- Outdate
That makes sense, since unicode_bin only treats characters as code points. But note that that isn't what collation actually is. That's the most common misunderstanding of how collation works. unicode.org/faq/collation.html -- Marathi

utf8_turkish_ci and utf8_romanian_ci treat them as distinct -- as shown in http://mysql.rjweb.org/utf8_collations.html

(Plus, of course, utf8_bin.)

For your added question: You are looking for a "character set" (not a "collation") that can represent Emoji and other non-Latin characters -- UTF-8 is the one to use. In MySQL, it is utf8mb4. The "collations" that are associated with that are named utf8mb4_.... Collations control ordering and equality, as indicated in the first part of your question about s and ş.
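One way to check a candidate collation is to compare the two characters directly. This is only a sketch: it assumes the connection uses utf8mb4, and it assumes the utf8mb4_ variants of the collations named above behave like their utf8_ counterparts.

```sql
-- Make sure the client sends and receives utf8mb4 (assumption about your setup)
SET NAMES utf8mb4;

-- Each column is 1 if the collation considers the two strings equal, 0 if not
SELECT 's' = 'ş' COLLATE utf8mb4_unicode_ci AS unicode_ci,
       's' = 'ş' COLLATE utf8mb4_turkish_ci AS turkish_ci,
       's' = 'ş' COLLATE utf8mb4_bin        AS bin;
```

Keep in mind that utf8mb4_bin compares code points, so it also treats upper and lower case as different.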

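MySQL's CHARACTER SET utf8 (also known as utf8mb3) stores at most 3 bytes per character, so what it can represent is a subset of utf8mb4. Either can handle the "letters" of nearly every script in the world. But only utf8mb4 can store 4-byte characters, which include Emoji and some Chinese characters.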
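To see the byte counts for yourself, LENGTH() reports bytes and CHAR_LENGTH() reports characters. The table below is hypothetical (demo_names and uq_name are made-up names), and the sketch again assumes a utf8mb4 connection.

```sql
SET NAMES utf8mb4;

-- ş takes 2 bytes in UTF-8, an Emoji takes 4; each is a single character
SELECT LENGTH('ş')  AS s_bytes,     CHAR_LENGTH('ş')  AS s_chars,
       LENGTH('😀') AS emoji_bytes, CHAR_LENGTH('😀') AS emoji_chars;

-- Hypothetical table: utf8mb4 is the character set, utf8mb4_bin the collation
CREATE TABLE demo_names (
    name VARCHAR(100) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin,
    UNIQUE KEY uq_name (name)
);

-- With a binary collation the unique key accepts both 's' and 'ş'
INSERT INTO demo_names (name) VALUES ('s'), ('ş'), ('😀');
```

The character set decides what can be stored; the collation on the column decides whether the unique key sees 's' and 'ş' as duplicates.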

Oliviero answered 9/11, 2018 at 1:7 Comment(0)
