Updated question ¹
With regard to character classes, comparison, sorting, normalization, and collations, which Unicode version or versions are supported by which .NET platforms?
Original question
I vaguely remember reading that .NET supported Unicode version 3.0 and that its internal UTF-16 encoding is not really UTF-16 but UCS-2, which is not the same thing. It seems, for instance, that characters above U+FFFF are not possible. Consider:
string s = "\u1D7D9"; // ("Mathematical double-struck digit one")
and it stores the string "ᵽ9".
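For what it's worth, this result can be reproduced and explained without any missing Unicode support. In C#, the `\u` escape reads exactly four hex digits, so `"\u1D7D9"` is U+1D7D followed by a literal `9`, whereas `\U` reads eight hex digits and can reach the higher planes. A minimal sketch, assuming C# 9 or later for top-level statements (the variable names are only illustrative):

```csharp
using System;
using System.Globalization;

// \u takes exactly four hex digits: this is U+1D7D followed by the character '9'.
string fourDigit = "\u1D7D9";

// \U takes exactly eight hex digits and can express code points above U+FFFF.
string eightDigit = "\U0001D7D9";

// char.ConvertFromUtf32 builds the same string from the numeric code point.
string converted = char.ConvertFromUtf32(0x1D7D9);

Console.WriteLine(fourDigit.Length);                                  // 2 (U+1D7D and '9')
Console.WriteLine(eightDigit.Length);                                 // 2 (one surrogate pair)
Console.WriteLine(char.IsSurrogatePair(eightDigit, 0));               // True
Console.WriteLine(char.ConvertToUtf32(eightDigit, 0).ToString("X"));  // 1D7D9
Console.WriteLine(eightDigit == converted);                           // True
Console.WriteLine(new StringInfo(eightDigit).LengthInTextElements);   // 1
```

Both strings have a `Length` of 2 because `Length` counts UTF-16 code units, not characters; only the second one holds a single code point stored as a surrogate pair.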
I'm basically looking for definitive references of answers to the following:
- If it isn't true UTF-16 in .NET, what is it?
- What version of Unicode is supported by .NET?
- If recent versions are not supported or planned in the near future, does anybody know of a (non)commercial library, or how I can work around this issue?
¹) I updated the question because, with the passing of time, the new wording seems more appropriate with respect to the answers and to the larger community. I left the original question in place; parts of it have been answered in the comments. Also, the old UCS-2 (no surrogates) was used in now-ancient 32-bit Windows versions; .NET has always used UTF-16 (with surrogates) internally.
\U (never needed it before I guess) and then wrongly concluded that there was no support for higher planes. – Discontinuity