What is this char? 65279 ''

I have two strings.

One is "\""

and the other is "\""

I think they are the same.

However, String.Compare says they are different.

This is very strange.

Here's my code:

string b = "\"";
string c = "\uFEFF\""; // the pasted literal begins with an invisible U+FEFF character

if (string.Compare(b, c) == 0)
{
    Console.WriteLine("Good");
}

if (c.StartsWith("\""))
{
    Console.WriteLine("C");
}

if (b.StartsWith("\""))
{
    Console.WriteLine("B");
}

I expected it to print "GoodCB".

However, it only prints "B".

In my debugger, c[0] is 65279 '' and c[1] is 34 '"', while b[0] is '"'.

But I don't know what 65279 '' is.

Is it an empty character?

Delphinedelphinia answered 22/7, 2011 at 1:33 Comment(3)
Where does your string come from? You're probably reading it wrong. - Angus
It very commonly appears as the first character in a UTF-16 encoded text file. Use StreamReader, not FileStream. - Amalburga
This is very likely related to this excellent answer/explanation (TL;DR: use StreamReader if the string was loaded from a Stream, use Encoding.GetString() if it was loaded from Encoding.GetBytes(); do not mix the two): https://mcmap.net/q/93688/-encoding-utf8-getstring-doesn-39-t-take-into-account-the-preamble-bom - Phosphorism
A
101

It's U+FEFF (65279 in decimal), a zero-width no-break space.
It's more commonly used as a byte-order mark (BOM).

Angus answered 22/7, 2011 at 1:34 Comment(3)
How can I remove that char when I cannot be sure whether the string starts with '' or not? - Whiggism
Thank you so much! I was hitting the wall till I found your solution! - Woebegone
Great response, which led me to keep looking and find this excellent answer/explanation (TL;DR: use StreamReader if the string was loaded from a Stream, use Encoding.GetString() if it was loaded from Encoding.GetBytes(); do not mix the two): https://mcmap.net/q/93688/-encoding-utf8-getstring-doesn-39-t-take-into-account-the-preamble-bom - Phosphorism
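
To illustrate the StreamReader versus Encoding.GetString() point from the comments, here is a minimal sketch. The class and method names (BomReadDemo, SampleBytes, DecodeRaw, DecodeWithReader) are made up for this example, and it assumes the input is UTF-8 with a BOM in front of a single double quote.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper class, only for illustration.
public static class BomReadDemo
{
    // Builds the bytes of a UTF-8 "file": the BOM (EF BB BF) followed by one double quote.
    public static byte[] SampleBytes()
    {
        byte[] preamble = Encoding.UTF8.GetPreamble();
        byte[] body = Encoding.UTF8.GetBytes("\"");
        byte[] all = new byte[preamble.Length + body.Length];
        preamble.CopyTo(all, 0);
        body.CopyTo(all, preamble.Length);
        return all;
    }

    // Decoding the raw bytes keeps the BOM, so the string starts with U+FEFF.
    public static string DecodeRaw(byte[] bytes)
    {
        return Encoding.UTF8.GetString(bytes);
    }

    // StreamReader detects the BOM and does not include it in the text it returns.
    public static string DecodeWithReader(byte[] bytes)
    {
        using (var reader = new StreamReader(new MemoryStream(bytes), Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }
}
```

With these bytes, DecodeRaw should give a two-character string (U+FEFF, then the quote), while DecodeWithReader should give just the quote.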
B
9

If you are using Notepad++, try converting the file to UTF-8 (no BOM), and also make sure all the files in your project use the same encoding.

Brumley answered 23/11, 2015 at 21:1 Comment(0)
B
5

You can remove it with:

c = c.Trim(new char[] { '\uFEFF', '\u200B' }); // strings are immutable, so assign the result back
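
Note that Trim strips the listed characters from both ends of the string. If only a leading BOM should go, a TrimStart-based variant may be closer to the intent. This is a small sketch; BomTrim and StripLeadingBom are hypothetical names, not part of .NET.

```csharp
using System;

// Hypothetical helper class, only for illustration.
public static class BomTrim
{
    // Removes any leading U+FEFF characters and leaves the rest of the string untouched.
    public static string StripLeadingBom(string text)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));
        return text.TrimStart('\uFEFF');
    }
}
```

It is safe to call whether or not the string starts with a BOM: BomTrim.StripLeadingBom(c) returns the input unchanged when there is nothing to strip.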
Bookmark answered 29/6, 2021 at 12:3 Comment(0)
C
4

If you are reading from a file that you opened in Notepad, Notepad may have added it; it is one of several programs notorious for doing so.

Cariole answered 22/7, 2011 at 1:58 Comment(3)
How can I remove that char when I cannot be sure whether the string starts with '' or not? - Whiggism
Notepad and other programs are saving UTF-8 files, which is a valid and common format. The BOM only bothers you when you read the file with the wrong encoding. - Angus
I want to check whether '' is present using c[0] == '', but this does not compile. - Whiggism
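
On the last comment: an invisible character pasted between single quotes tends to get lost, which leaves an empty char literal that will not compile. Writing the character as the escape '\uFEFF' avoids that. A minimal sketch follows; BomCheck, Bom and StartsWithBom are hypothetical names.

```csharp
using System;

// Hypothetical helper class, only for illustration.
public static class BomCheck
{
    // U+FEFF written as an escape, so nothing invisible has to be typed; its value is 65279.
    public const char Bom = '\uFEFF';

    // True when the string is non-empty and its first char is U+FEFF.
    public static bool StartsWithBom(string text)
    {
        return !string.IsNullOrEmpty(text) && text[0] == Bom;
    }
}
```

Comparing the char directly is an ordinal check. If you prefer StartsWith, passing StringComparison.Ordinal is probably the safer choice, since culture-sensitive comparisons can treat U+FEFF as an ignorable character.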
S
-1

It is a byte order mark (BOM). A BOM is a special marker at the beginning of a file that indicates the byte order of the text data in the file.

We can remove the BOM in JavaScript using the following code:

    function removeBOM(jsonString) {
        if (jsonString.charCodeAt(0) === 0xfeff) {
            jsonString = jsonString.slice(1);
        } 
        return jsonString;
    }
Stopwatch answered 13/3, 2023 at 9:0 Comment(0)
