I'm having trouble with character encoding. When I put the following two characters into a UTF-32 encoded text file:
𩸕
鸕
and then run this code on them:
    System.IO.StreamReader streamReader =
        new System.IO.StreamReader("input", System.Text.Encoding.UTF32, false);
    System.IO.StreamWriter streamWriter =
        new System.IO.StreamWriter("output", false, System.Text.Encoding.UTF32);
    streamWriter.Write(streamReader.ReadToEnd());
    streamWriter.Close();
    streamReader.Close();
I get:
鸕
鸕
(the same character twice, i.e. the output file != the input file)
A few things that might help. The hex bytes for the first character (U+29E15, stored as UTF-32 little-endian):
15 9E 02 00
And for the second (U+9E15):
15 9E 00 00
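One thing I noticed from those bytes: the two code points differ only above bit 15. Here is a minimal sketch of that arithmetic, assuming nothing beyond the standard `char.ConvertFromUtf32` call. The class and method names are hypothetical, made up for illustration.

```csharp
using System;

// Hypothetical helper, only for illustrating the code point arithmetic.
public static class SurrogateCheck
{
    // Returns the UTF-16 code units of a code point as space-separated hex.
    public static string ToUtf16Hex(int codePoint)
    {
        // ConvertFromUtf32 yields one char for BMP code points and a
        // surrogate pair (two chars) for code points above U+FFFF.
        string s = char.ConvertFromUtf32(codePoint);
        string[] units = new string[s.Length];
        for (int i = 0; i < s.Length; i++)
        {
            units[i] = ((int)s[i]).ToString("X4");
        }
        return string.Join(" ", units);
    }

    // Keeps only the low 16 bits of a code point, which is what a plain
    // cast from a 32-bit value to a 16-bit char would do.
    public static int LowSixteenBits(int codePoint)
    {
        return codePoint & 0xFFFF;
    }
}
```

`SurrogateCheck.ToUtf16Hex(0x29E15)` gives `D867 DE15`, while `SurrogateCheck.LowSixteenBits(0x29E15)` is `0x9E15`, which is exactly the second character. So the output I get is consistent with the upper bits of the first code point being dropped somewhere, though that alone doesn't prove where it happens.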
I am using gedit to create the text file and Mono to run the C#, all on Ubuntu.
It doesn't matter whether I explicitly specify the encoding for the input or the output file; it fails whenever the input file is UTF-32 encoded. It works if the input file is UTF-8 encoded.
The raw bytes of the input file, including the UTF-32 little-endian byte order mark, are as follows:
FF FE 00 00 15 9E 02 00 0A 00 00 00 15 9E 00 00 0A 00 00 00
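To separate the file handling from the decoding itself, here is a rough sketch that decodes those same bytes in memory, without `StreamReader`, and lists the resulting UTF-16 code units. It uses only `Encoding.UTF32.GetString`; the class name, field, and method are hypothetical names made up for this example, and the BOM is skipped by hand through an offset argument.

```csharp
using System;
using System.Text;

// Hypothetical probe, only for comparing in-memory decoding with the file round trip.
public static class Utf32Probe
{
    // The exact bytes of the input file quoted above, BOM included.
    public static readonly byte[] InputBytes =
    {
        0xFF, 0xFE, 0x00, 0x00, // UTF-32 LE byte order mark
        0x15, 0x9E, 0x02, 0x00, // U+29E15
        0x0A, 0x00, 0x00, 0x00, // line feed
        0x15, 0x9E, 0x00, 0x00, // U+9E15
        0x0A, 0x00, 0x00, 0x00  // line feed
    };

    // Decodes UTF-32 LE bytes starting at 'offset' and returns the
    // decoded string's UTF-16 code units as space-separated hex.
    public static string DecodeToUtf16Hex(byte[] bytes, int offset)
    {
        string text = Encoding.UTF32.GetString(bytes, offset, bytes.Length - offset);
        StringBuilder sb = new StringBuilder();
        foreach (char c in text)
        {
            if (sb.Length > 0) sb.Append(' ');
            sb.Append(((int)c).ToString("X4"));
        }
        return sb.ToString();
    }
}
```

With a decoder that handles supplementary characters correctly, `Utf32Probe.DecodeToUtf16Hex(Utf32Probe.InputBytes, 4)` should give `D867 DE15 000A 9E15 000A`. If the first two units come back as a single `9E15` instead, the problem is in the UTF-32 decoding rather than in the writer.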
Is it a bug, or is it just me?
Thanks!
[...] streamReader.ReadToEnd(). – Ego
[...] streamReader.ReadToEnd() into a string, and then check that. It should be the UTF-16 encoded version of the input. – Martens
[...] input and save it, no problems, but my small annoying code just won't... – Newel