Why do these two string comparisons return different results?
Here is a small piece of code:

String a = "abc";

Console.WriteLine(((object)a) == ("ab" + "c")); // true 
Console.WriteLine(((object)a) == ("ab" + 'c')); // false 

Why?

Hetman answered 8/4, 2015 at 13:40 Comment(3)
"ab" + 'c' are two different object types, it evaluates to ab. – Hasp
Sure you can add a string and a character, it results in the string "abc". – Crocodilian
See ericlippert.com/2009/09/28/string-interning-and-string-empty for my article about some issues closely related to the phenomenon you've noticed. – Mafaldamafeking
I
75

Because the == here is doing a reference comparison. The C# compiler "groups" together all the equal strings that are known at compile time, so that

string a = "abc";
string b = "abc";

will point to the same "abc" string. So they will be referentially equal.

Now, ("ab" + "c") is simplified at compile time to "abc", while "ab" + 'c' is not, and so is not referentially equal (the concatenation operation is done at runtime).
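The difference between a compile-time constant and a run-time concatenation can be sketched as follows. This is only an illustration: the class and method names are made up for this example, and the run-time case takes its operands as parameters so that the compiler cannot fold the concatenation, whatever the compiler version.

```csharp
using System;

// Hypothetical demo class, not part of the original question.
public static class InterningDemo
{
    // Two identical literals in the same assembly refer to one string object.
    public static bool SameLiterals()
    {
        string a = "abc";
        string b = "abc";
        return object.ReferenceEquals(a, b);
    }

    // "ab" + "c" is a constant expression, so the compiler emits the literal "abc".
    public static bool FoldedConstant()
    {
        string a = "abc";
        return object.ReferenceEquals(a, "ab" + "c");
    }

    // The operands are parameters, so the concatenation happens at run time
    // and produces a new string object with the same contents.
    public static bool RuntimeConcatIsSameObject(string prefix, char last)
    {
        string a = "abc";
        string built = prefix + last;
        return object.ReferenceEquals(a, built);
    }
}
```

Called as InterningDemo.RuntimeConcatIsSameObject("ab", 'c'), the last method returns false even though the two strings have equal contents, while the first two methods return true.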

See the decompiled code here

I'll add that Try Roslyn is decompiling this incorrectly :-) And so is ILSpy :-(

It is decompiling to:

string expr_05 = "abc";
Console.WriteLine(expr_05 == "abc");
Console.WriteLine(expr_05 == "ab" + 'c');

So it shows a string comparison, which is wrong. But at least the fact that some strings are calculated at compile time can be clearly seen.

Why is your code doing a reference comparison? Because you are casting one of the two operands to object, and operator== in .NET isn't virtual, so it must be resolved at compile time from the static types of the operands. From the == Operator documentation:

For predefined value types, the equality operator (==) returns true if the values of its operands are equal, false otherwise. For reference types other than string, == returns true if its two operands refer to the same object. For the string type, == compares the values of the strings.

To the compiler, the first operand of the == operator isn't a string (because you cast it to object), so the string overload of == isn't selected and a reference comparison is used instead.
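A minimal sketch of how the static type of the operands selects the comparison. The class and method names are assumptions made for this example; Build takes parameters so that its result is always created at run time.

```csharp
using System;

// Hypothetical demo class, not part of the original answer.
public static class OperatorResolutionDemo
{
    // Builds a string at run time; the result is a new object.
    public static string Build(string prefix, char last) => prefix + last;

    // Both operands are typed as object, so == is a reference comparison.
    public static bool ReferenceCompare(string a, string b) => (object)a == (object)b;

    // Both operands are typed as string, so == binds to String.op_Equality.
    public static bool ValueCompare(object a, object b) => (string)a == (string)b;

    // Equals is virtual, so the String override runs even through object.
    public static bool VirtualEquals(object a, object b) => a.Equals(b);
}
```

With string a = "abc" and string b = OperatorResolutionDemo.Build("ab", 'c'), ReferenceCompare(a, b) is false, while ValueCompare(a, b) and VirtualEquals(a, b) are both true.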

Interesting fact: at the CIL level (the assembly language of .NET), the opcode used is ceq, which does a value comparison for primitive value types and a reference comparison for reference types (so in the end it always does a bit-by-bit comparison, with some exceptions for the floating-point types with NaN). It doesn't use "special" operator== methods. This can be seen in this example

where the

Console.WriteLine(a == ("ab" + 'c')); // True 

is resolved at compile time in a call to

call bool [mscorlib]System.String::op_Equality(string, string)

while the other == are simply

ceq

This explains why the Roslyn decompiler works "badly" (as does ILSpy :-(, see the bug report)... It sees a ceq opcode and doesn't check whether a cast is needed to rebuild the correct comparison.

Holger asked why only the addition of two string literals is done by the compiler... Now, reading the C# 5.0 specification in a very strict way, and considering the C# 5.0 specification to be "separate" from the .NET specifications (with the exception of the prerequisites that C# 5.0 has for some classes/structs/methods/properties/...), we have:

String concatenation:

string operator +(string x, string y);
string operator +(string x, object y);
string operator +(object x, string y);

These overloads of the binary + operator perform string concatenation. If an operand of string concatenation is null, an empty string is substituted. Otherwise, any non-string argument is converted to its string representation by invoking the virtual ToString method inherited from type object. If ToString returns null, an empty string is substituted.

So the cases string + string, string + null and null + string are all precisely described, and their result can be "calculated" by using only the rules of the C# specification. For every other type, the virtual ToString method must be called. The result of the virtual ToString method isn't defined for any type in the C# specification, so a compiler that "presumed" its result would be doing the wrong thing. For example, a .NET version whose System.Boolean.ToString() returned Yes/No instead of True/False would still be OK for the C# specification.
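To illustrate why the result of ToString cannot be presumed at compile time, here is a small sketch in which the same concatenation gives different results depending on run-time state. It is not taken from the specification; the class and method names are made up for this example, and a double is used because its parameterless ToString follows the current culture.

```csharp
using System;
using System.Globalization;

// Hypothetical demo class, not part of the original answer.
public static class ToStringDemo
{
    // Concatenates "x" with the value while the current culture uses the given
    // decimal separator, then restores the previous culture.
    public static string ConcatWithSeparator(string separator, double value)
    {
        CultureInfo previous = CultureInfo.CurrentCulture;
        try
        {
            // Clone the invariant culture so that it can be modified.
            CultureInfo custom = (CultureInfo)CultureInfo.InvariantCulture.Clone();
            custom.NumberFormat.NumberDecimalSeparator = separator;
            CultureInfo.CurrentCulture = custom;

            // string + double converts the double with its ToString() method,
            // which formats according to the current culture.
            return "x" + value;
        }
        finally
        {
            CultureInfo.CurrentCulture = previous;
        }
    }
}
```

ToStringDemo.ConcatWithSeparator(".", 1.5) gives "x1.5", while ToStringDemo.ConcatWithSeparator(",", 1.5) gives "x1,5", so the result of "x" + value can only be known at run time.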

Iolaiolande answered 8/4, 2015 at 13:46 Comment(4)
You should mention why a reference comparison is done too to make this answer perfect. – Crocodilian
Could be useful to mention that if you cast a into a string, both console outputs are true. – Thrombo
The only question left is why "ab"+"c" is solved at compile time and "ab"+'c' not. E.g. in Java, both are compile-time constants so why not in C#… – Tagmemic
@Tagmemic They probably didn't think of it. From some tests I did, I think they only optimized string + null and string + string. string + int isn't optimized. Nor is string + bool. Note that String.Empty is a readonly field, so that too isn't optimized if used... "A" + String.Empty is done at runtime. – Iolaiolande
S
-3

The addresses are not the same. If you want to compare the characters of the strings, I suggest using Equals.

Senior answered 9/4, 2015 at 8:40 Comment(0)
