I know what string interning is, and why the following code behaves the way it does:
var hello = "Hello";
var he_llo = "He" + "llo";
var b = ReferenceEquals(hello, he_llo); //true
Or
var hello = "Hello";
var h_e_l_l_o = new string(new char[] { 'H', 'e', 'l', 'l', 'o' });
var b = ReferenceEquals(hello, h_e_l_l_o); //false
...or I thought I did, because a subtle bug has cropped up in some code I'm working on due to this:
var s = "";
var sss = new string(new char[] { });
var b = ReferenceEquals(s, sss); //True!?
How does the compiler know that sss will in fact be an empty string?
The string constructor for char[] has exceptional logic for this in the CLR internally, and will simply point to the one, true, empty string if you pass an empty array rather than actually construct a new object. There is a question on SO (with a bad title) that explains it. To be clear, this is a runtime issue -- the surprise is not that the compiler is clairvoyant but that new doesn't always new. – Addendum
Is there any string s at runtime (such that s.Length == 0) for which Object.ReferenceEquals(s, "") does not hold? If there is, I haven't found it -- creating one by manipulating an initially non-empty string doesn't seem to do it, no matter how clever you get. – Addendum
The newobj opcode creates "a new object or a new instance of a value type". Nowhere does it say that the runtime is allowed to return a reference to an existing instance in this case, but this is exactly what the CLR does anyway. It wouldn't be so bad if this weren't an observable difference, but it is. I'd be tempted to call it a bug, except the behavior is so old (and the optimization demonstrably useful) that it's more of a quirk. – Addendum
"The newobj instruction allocates a new instance of the class associated with ctor and initializes all the fields in the new instance to 0 (of the proper type) or null as appropriate. It then calls the constructor with the given arguments along with the newly created instance. After the constructor has been called, the now initialized object reference is pushed on the stack." First of all, this is obviously not what literally happens for string (just in effect), but even here, I would never expect the same reference to be returned twice based on this description! – Addendum
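For anyone who wants to poke at this themselves, here is a minimal sketch of the check discussed in the comments above. The class name EmptyStringProbe and the helper IsSharedEmpty are made up for illustration and appear nowhere in the question. It only uses standard String and StringBuilder members. No expected output is listed, since what it prints is a property of the runtime you run it on and could differ between CLR versions.

```csharp
using System;
using System.Text;

static class EmptyStringProbe
{
    // Hypothetical helper: true when the candidate is zero-length AND is
    // the very same object as the "" literal (reference identity, not value equality).
    public static bool IsSharedEmpty(string candidate) =>
        candidate.Length == 0 && object.ReferenceEquals(candidate, "");

    static void Main()
    {
        // Several ways of producing a zero-length string at runtime.
        string fromCtor = new string(new char[] { });
        string fromSubstring = "Hello".Substring(0, 0);
        string fromRemove = "Hello".Remove(0);
        string fromBuilder = new StringBuilder().ToString();

        Console.WriteLine($"new string(empty char[]): {IsSharedEmpty(fromCtor)}");
        Console.WriteLine($"Substring(0, 0):          {IsSharedEmpty(fromSubstring)}");
        Console.WriteLine($"Remove(0):                {IsSharedEmpty(fromRemove)}");
        Console.WriteLine($"empty StringBuilder:      {IsSharedEmpty(fromBuilder)}");

        // Control case from the question: a non-empty string built at runtime
        // compared by reference against the equal-valued literal.
        string nonEmpty = new string(new char[] { 'H', 'e', 'l', 'l', 'o' });
        Console.WriteLine($"non-empty ctor vs literal: {object.ReferenceEquals(nonEmpty, "Hello")}");
    }
}
```

If every empty case prints True while the control case prints False, that matches the quirk described above: the constructor only short-circuits for the empty string, and everything else gets a fresh object.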