Consider:
int a = 42;
// Reference equality on two boxed ints with the same value
Console.WriteLine( (object)a == (object)a ); // False
// Same thing - listed only for clarity
Console.WriteLine(ReferenceEquals(a, a)); // False
Clearly, each boxing instruction allocates a separate instance of a boxed Int32, which is why reference-equality between them fails. This page appears to indicate that this is specified behaviour:
The box instruction converts the 'raw' (unboxed) value type into an object reference (type O). This is accomplished by creating a new object and copying the data from the value type into the newly allocated object.
But why does this have to be the case?
Is there any compelling reason why the CLR does not choose to hold a "cache" of boxed Int32s, or even stronger, common values for all primitive value-types (which are all immutable)? I know Java has something like this.
In the days of no-generics, wouldn't it have helped out a lot with reducing the memory requirements as well as GC workload for a large ArrayList consisting mainly of small integers? I'm also sure that there exist several modern .NET applications that do use generics, but for whatever reason (reflection, interface assignments etc.), run up large boxing-allocations that could be massively reduced with (what appears to be) a simple optimization.
So what's the reason? Some performance implication I haven't considered (I doubt if testing that the item is in the cache etc. will result in a net performance loss, but what do I know)? Implementation difficulties? Issues with unsafe code? Breaking backwards compatibility (I can't think of any good reason why a well-written program should rely on the existing behaviour)? Or something else?
EDIT: What I was really suggesting was a static cache of "commonly-occurring" primitives, much like what Java does. For an example implementation, see Jon Skeet's answer. I understand that doing this for arbitrary, possibly mutable, value-types or dynamically "memoizing" instances at run-time is a completely different matter.
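To make the idea concrete, here is a minimal sketch of the kind of static cache I have in mind. It is only an illustration: BoxCache, Box, and the range -128 to 127 are names and values I made up for this example (the range simply mirrors Java's Integer cache), and nothing like this exists in the CLR or the BCL as far as I know.

```csharp
using System;

// Hypothetical helper, not part of the CLR or BCL:
// a static cache of pre-boxed Int32 values for a small, fixed range.
public static class BoxCache
{
    // Assumed range for illustration, mirroring Java's Integer cache.
    private const int Low = -128;
    private const int High = 127;

    // Every value in [Low, High] is boxed exactly once, at type initialization.
    private static readonly object[] Cache = CreateCache();

    private static object[] CreateCache()
    {
        var boxes = new object[High - Low + 1];
        for (int i = 0; i < boxes.Length; i++)
        {
            boxes[i] = i + Low; // ordinary boxing, performed once per value
        }
        return boxes;
    }

    // Returns the shared box for values inside the cached range;
    // values outside it are boxed normally, allocating a new object each time.
    public static object Box(int value)
    {
        return (value >= Low && value <= High)
            ? Cache[value - Low]
            : (object)value;
    }
}
```

With this sketch, object.ReferenceEquals(BoxCache.Box(42), BoxCache.Box(42)) would be true because both calls return the same cached object, while the same comparison for 1000 would still be false because each call allocates a fresh box. The cost is a range check on every call plus the memory held by the cache array for the lifetime of the process.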
EDIT: Changed title for clarity.
Int32 have this "caching" behavior, or all primitive value types? What about user-defined value types? Probably "no" for the latter; does that suggest "no" for the former? – Roydd