How do I get the numeric value of a Unicode character in C#?
For example, if the Tamil character அ (U+0B85) is given, the output should be 2949 (i.e. 0x0B85).
See also
- C++: How to get decimal value of a unicode character in c++
- Java: How can I get a Unicode character's code?
Multi code-point characters
Some characters require multiple code points. In these examples, encoded as UTF-16, each code unit is still in the Basic Multilingual Plane:
- U+0072 U+0327 U+030C
- U+0072 U+0338 U+0327 U+0316 U+0317 U+0300 U+0301 U+0302 U+0308 U+0360
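To make the first sequence concrete, here is a minimal sketch, assuming a C# 9+ console program with top-level statements. The variable names are illustrative only. It builds the three code points from escapes, then compares the number of `char` values with the number of text elements reported by `System.Globalization.StringInfo`.

```csharp
using System;
using System.Globalization;

// The first sequence above: base letter 'r' plus two combining marks.
string sequence = "\u0072\u0327\u030C";

// Number of UTF-16 code units (char values) in the string.
int codeUnitCount = sequence.Length;

// Number of text elements ("characters" as a reader perceives them).
int textElementCount = new StringInfo(sequence).LengthInTextElements;

// True if any code unit is half of a surrogate pair, i.e. outside the BMP.
bool anySurrogates = false;
foreach (char c in sequence)
{
    if (char.IsSurrogate(c)) anySurrogates = true;
}

Console.WriteLine(codeUnitCount);    // 3
Console.WriteLine(textElementCount); // 1
Console.WriteLine(anySurrogates);    // False
```

So the string holds three `char` values, none of them surrogates, yet it displays as a single character.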
The larger point is that one "character" can require more than one UTF-16 code unit: more than 2, more than 3, even dozens of Unicode code points. In C#, where strings are UTF-16, that means more than one char. One character can require 17 char values.
My question was about converting a char into its UTF-16 code unit value. Even if an entire string of 17 char values represents only one "character", I still want to know how to convert each UTF-16 code unit into a numeric value.
e.g.
String s = "அ";
int i = Unicode(s[0]);
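For reference, one way the hypothetical `Unicode(...)` call above could plausibly be written is a plain numeric conversion, since a C# `char` is a 16-bit UTF-16 code unit that converts implicitly to `int`. This is a sketch under that assumption, not a definitive answer. `CodeUnits` is an illustrative helper name, not a framework API.

```csharp
using System;
using System.Linq;

// Hypothetical helper: the numeric value of every UTF-16 code unit in a string.
static int[] CodeUnits(string text) => text.Select(c => (int)c).ToArray();

string tamil = "அ";       // U+0B85
int value = tamil[0];     // implicit char -> int conversion

Console.WriteLine(value);           // 2949
Console.WriteLine($"0x{value:X4}"); // 0x0B85

// Each code unit of a multi code-point "character" converts the same way.
foreach (int unit in CodeUnits("\u0072\u0327\u030C"))
{
    Console.WriteLine($"U+{unit:X4}"); // U+0072, U+0327, U+030C
}
```

This yields one number per `char`. For code points outside the Basic Multilingual Plane, each half of a surrogate pair would be reported separately, which matches the per-code-unit behavior asked for here.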
char (or MyString[3], which is a char) – Disinclination
Unicode function (msdn.microsoft.com/en-us/library/ms180059.aspx) – Disinclination