Why is boxing a primitive value-type in .NET uncached, unlike Java?
Consider:

int a = 42;

// Reference equality on two boxed ints with the same value
Console.WriteLine( (object)a == (object)a ); // False

// Same thing - listed only for clarity
Console.WriteLine(ReferenceEquals(a, a));  // False

Clearly, each boxing instruction allocates a separate instance of a boxed Int32, which is why reference-equality between them fails. This page appears to indicate that this is specified behaviour:

The box instruction converts the 'raw' (unboxed) value type into an object reference (type O). This is accomplished by creating a new object and copying the data from the value type into the newly allocated object.

But why does this have to be the case? Is there any compelling reason why the CLR does not choose to hold a "cache" of boxed Int32s, or even stronger, common values for all primitive value-types (which are all immutable)? I know Java has something like this.

In the days of no-generics, wouldn't it have helped a lot with reducing memory requirements, as well as GC workload, for a large ArrayList consisting mainly of small integers? I'm also sure there exist modern .NET applications that do use generics but, for whatever reason (reflection, interface assignments etc.), run up large boxing allocations that could be massively reduced with (what appears to be) a simple optimization.
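To make the ArrayList point concrete, here is a minimal sketch (class and variable names are mine, for illustration) showing that under the current behaviour each Add call produces a separate boxed instance, while a generic List&lt;int&gt; boxes nothing:

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

class BoxingDemo
{
    static void Main()
    {
        // Pre-generics: Add takes object, so each call boxes the int
        // into a fresh heap object.
        ArrayList list = new ArrayList();
        for (int i = 0; i < 3; i++)
        {
            list.Add(0); // three separate boxed Int32s, all holding 0
        }
        Console.WriteLine(ReferenceEquals(list[0], list[1])); // False

        // With generics, the ints are stored inline - no boxing at all.
        List<int> generic = new List<int> { 0, 0, 0 };
        Console.WriteLine(generic.Count); // 3
    }
}
```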

So what's the reason? Some performance implication I haven't considered (I doubt if testing that the item is in the cache etc. will result in a net performance loss, but what do I know)? Implementation difficulties? Issues with unsafe code? Breaking backwards compatibility (I can't think of any good reason why a well-written program should rely on the existing behaviour)? Or something else?

EDIT: What I was really suggesting was a static cache of "commonly-occurring" primitives, much like what Java does. For an example implementation, see Jon Skeet's answer. I understand that doing this for arbitrary, possibly mutable, value-types or dynamically "memoizing" instances at run-time is a completely different matter.

EDIT: Changed title for clarity.

Relique answered 23/11, 2010 at 14:8 Comment(2)
Should only Int32 have this "caching" behavior, or all primitive value types? What about user-defined value types? Probably "no" for the latter; does that suggest "no" for the former?Roydd
As I mention, I would think that it should be for "common values for all primitive value-types". I haven't thought it through fully, and I don't have the answers, that's why I asked the question. :)Relique
One reason which I find compelling is consistency. As you say, Java does cache boxed values in a certain range... which means it's all too easy to write code which works for a while:

// Passes in all my tests. Shame it fails if they're > 127...
if (value1 == value2) {
    // Do something
}

I've been bitten by this - admittedly in a test rather than production code, fortunately, but it's still nasty to have something which changes behaviour significantly outside a given range.

Don't forget that any conditional behaviour also incurs a cost on all boxing operations - so in cases where it wouldn't use the cache, you'd actually find that it was slower (because it would first have to check whether or not to use the cache).

If you really want to write your own caching box operation, of course, you can do so:

public static class Int32Extensions
{
    private static readonly object[] BoxedIntegers = CreateCache();

    private static object[] CreateCache()
    {
        object[] ret = new object[256];
        for (int i = -128; i < 128; i++)
        {
            ret[i + 128] = i; // boxing happens here, once per cached value
        }
        return ret;
    }

    public static object Box(this int i)
    {
        return (i >= -128 && i < 128) ? BoxedIntegers[i + 128] : (object) i;
    }
}

Then use it like this:

object y = 100.Box();
object z = 100.Box();

if (y == z)
{
    // Cache is working
}
Sarcasm answered 23/11, 2010 at 14:11 Comment(17)
Regarding consistency, the related ReferenceEquals(someString, someOtherStringWithSameChars) is already inconsistent, so interning / caching can already bite you. I can't imagine this reason is strong enough to make up for the downsides, like the ArrayList or HashTable footprints for value-types.Relique
@Ani: Strings don't just magically come into existence from value types with no visible operation, unlike boxes.Sarcasm
I agree that that the test would make things slower for a cache miss. But I would wager that most boxed primitives in real-world apps would fall into a tight-range around zero. In any case, couldn't we get 100% coverage on byte and sbyte, maybe even short?Relique
Would you agree that a large proportion of boxing in most apps happen inside BCL / third-party code / "silently"? To that extent, I would think that implementing a custom-cache, while useful, seems like a weak workaround.Relique
@Ani: They currently happen silently - although not necessarily in BCL or third-party code. I really can't get excited by this optimization - it feels like it's adding complexity without any evidence that it will have a meaningful impact on performance. Of course, if you want to profile some significant apps, make changes to explicitly use a cache, and then argue from data, that's a different matter. Until then, I'll favour consistency :) You may not buy it as a reason, but as someone who's been bitten by Java's behaviour, it makes sense to me :)Sarcasm
Out of interest, what are the scenarios when reference-equality on value-types are important to an app? Not saying you're wrong, but I can't think of any. Don't tell me people use a reference to a boxed int for locking :)Relique
@Ani: Accident, most likely. I prefer it when bad code doesn't sometimes work... which is the case in Java. (Yes, it will still work when you copy the actual reference in C#, but not when you box twice.)Sarcasm
I still don't understand. When is it useful to do a ref-equals on two boxed ints? If one did that, how would always true or always false be "better" behaviour than undefined?Relique
In the case where you accidentally write code which uses reference equality but want value equality: it will work with simple test cases (i.e. those that hit the cache) as if you had implemented value equality, and then suddenly fail once you get to cases outside the cache.Multiplechoice
And the benefit of this optimization is much smaller in .net than in java since the .net generics don't box.Multiplechoice
"Would you agree that a large proportion of boxing in most apps happen inside BCL" I don't agree with that. It occurs when calling APIs that take object but where you supply the value-type. This is usually the case where your own code calls BCL code, and not inside BCL code itself. Or are there any important cases I'm missing?Multiplechoice
@Ani: "If one did that, how would always true or always false be "better" behaviour than undefined?" Currently the behaviour is always false, assuming separate boxing operations on both sides. You're the one suggesting that it should sometimes return true, sometimes false.Sarcasm
Yes, that's right. What I'm asking for is a plausible scenario for doing ref-equals on boxed value-types.When would you really need to do that? It's an already dangerous thing to do semantically, so a little bit of weird behaviour on top of that doesn't make things much worse, IMO. If "by accident" you mean that one thinks they're performing value-equality when they actually aren't, they're in pretty deep trouble already. For example, you can "luck out" and have two identical references to the same boxed structure. So when should you be doing ref-equals on 2 boxed value-types?Relique
@CodeInChaos: That wasn't my full-sentence. I said BCL / third-party apps / silently (autoboxing etc.)Relique
I only disagreed with that part of the sentence. That badly written third party code can do bad stuff is obvious. And auto-boxing is one of the features of C# I don't really like.Multiplechoice
@Jon Skeet: Nobody has stepped forward with a better reason, so accepted...grudgingly :)Relique
@JonSkeet some measurements for you github.com/benaadams/PerfTesting/tree/master/src/BoxingTestMost
I can't claim to be able to read minds, but here are a couple of factors:

1) Caching the value types can make for unpredictability: comparing two boxed values for reference equality could yield true or false depending on cache hits and implementation details. Ouch!

2) The lifetime of a boxed value type is most likely short - so how long do you hold the value in cache? Now you either have a lot of cached values that will no longer be used, or you need to make the GC implementation more complicated to track the lifetime of cached value types.

With these downsides, what is the potential win? A smaller memory footprint in an application that does a lot of long-lived boxing of equal value types. Since this win affects only a small number of applications and can be worked around by changing code, I'm going to agree with the C# spec writers' decision here.

Cutright answered 23/11, 2010 at 14:20 Comment(7)
The lifetime issue is an interesting one. I'm sure there could be some really smart solutions, but why not just cache small values forever?Relique
@Relique Say I had a loop that boxed all the check numbers of checks (N checks) I'd received, and it boxes each one twice. With caching this saves N x size of boxed value memory. But if the loop only occurs once in a server application (at startup), I've just eaten up N x size of boxed value memory. If I'm a business and have received 100k checks to date, do I want to save that much memory temporarily only to have the same amount permanently allocated? How much of it is paged in when I need to check for a cache hit later?Cutright
By small values, I'm talking about a "static" cache like Java's, e.g. -128 to 127. The memory requirements for this are negligible, at least on a desktop. Dynamic caching would need to be a lot smarter; maybe that would be overkill.Relique
@Relique I see what you mean now, but I'm not sure the added complexity actually buys us anything. In the rare cases where this caching makes a significant difference, we can change the code to not do so much boxing. I don't think the added complexity and the fact that values over 128 behave differently than values under 128 is outweighed by the benefit.Cutright
@Ani: a cache that caches forever is called a "memory leak".Cowled
@Eric Lippert: I repeat that I'm talking about a static cache of "small" values, E.g. -128 to 127. Why would you characterize that as a memory-leak?Relique
@EricLippert: I would posit that the term "memory leak" only applies in situations where the extra memory requirement is potentially unbounded. If code had an Object[65536] that were used to hold integers, and cached each value on first use, the maximum memory waste would never exceed half a meg on a 32/64 bit system.Rilke
Boxed value objects are not necessarily immutable. It is possible to change the value in a boxed value type, such as through an interface.

So if boxing a value type always returned the same instance based on the same original value, it would create references which may not be appropriate (for example, two different value type instances which happen to have the same value end up with the same reference even though they should not).

public interface IBoxed
{
    int X { get; set; }
    int Y { get; set; }
}

public struct BoxMe : IBoxed
{
    public int X { get; set; }

    public int Y { get; set; }
}

public static void Test()
{
    BoxMe original = new BoxMe()
                        {
                            X = 1,
                            Y = 2
                        };
    
    object boxed1 = (object) original;
    object boxed2 = (object) original;

    ((IBoxed) boxed1).X = 3;
    ((IBoxed) boxed1).Y = 4;

    Console.WriteLine("original.X = " + original.X);
    Console.WriteLine("original.Y = " + original.Y);
    Console.WriteLine("boxed1.X = " + ((IBoxed)boxed1).X);
    Console.WriteLine("boxed1.Y = " + ((IBoxed)boxed1).Y);
    Console.WriteLine("boxed2.X = " + ((IBoxed)boxed2).X);
    Console.WriteLine("boxed2.Y = " + ((IBoxed)boxed2).Y);
}

Produces this output:

original.X = 1
original.Y = 2
boxed1.X = 3
boxed1.Y = 4
boxed2.X = 1
boxed2.Y = 2

If boxing didn't create a new instance, then boxed1 and boxed2 would have the same values, which would be inappropriate if they were created from different original value type instance.

Gribble answered 23/11, 2010 at 14:22 Comment(2)
Fair enough, but I would imagine that this feature would mainly target primitives.Relique
@Ani: All boxed value types, even primitives, are mutable. Even if a value does not allow any mechanism for mutation, it will be possible to replace the content (primitive value, or the contents of all public and private struct fields) of a boxed value type with the content of another instance of that same type.Rilke
There's an easy explanation for this: un/boxing is fast. It needed to be back in the .NET 1.x days. After the JIT compiler generates the machine code for it, there's but a handful of CPU instructions generated for it, all inline without method calls. Not counting corner cases like nullable types and large structs.

The effort of looking up a cached value would greatly diminish the speed of this code.

Personify answered 27/11, 2010 at 10:20 Comment(2)
+1: Thanks, some valid points. A few thoughts: 1. What I suggest has no impact on unboxing whatsoever. 2. The cache lookup is just an array bounds check followed possibly by array-retrieval by index. Is it really that expensive? Wouldn't heap allocation + future GC cost probably be more expensive (all assuming that the cache hits more often than not in the cached scheme). 3. Finally, this isn't only about speed. It's also about space. Wouldn't this optimization have reduced memory usage massively in many .NET 1.x apps? -- To sum up, isn't this likely to be a net performance win? Thoughts?Relique
Speed is everything, .NET 1.0 had to compete with unmanaged compilers and win the heart and minds of programmers that were used to C compiled code perf. I vividly remember looking at an early Java release and dismissing it outright, unfit to do anything but a scripting job. Yes, expensive in boxing only. But way more expensive, you can't cache every possible value type value. The branch mispredictions the cpu suffers on testing the value is redrum on perf. Trading memory for speed is a very common outcome.Personify
I wouldn't think a run-time-filled cache would be a good idea, but it might be reasonable on 64-bit systems to define ~8 billion of the 64 quintillion possible object-reference values as integer or float literals, and on any system to pre-box all primitive literals. Testing whether the upper 31 bits of a reference hold some reserved value should probably be cheaper than a memory dereference.
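The idea can be sketched by simulating a tagged reference word. This is purely illustrative — the CLR exposes no such mechanism, and for simplicity the sketch reserves the upper 32 bits rather than 31: if the upper bits match a reserved pattern, the low bits encode the integer directly and no heap object is involved.

```csharp
using System;

static class TaggedRefSketch
{
    // Hypothetical reserved pattern in the upper 32 bits of a 64-bit "reference".
    const ulong IntTag = 0xFFFFFFFE00000000UL;

    // Encode an int directly into the "reference" word - no allocation.
    static ulong MakeIntRef(int value) => IntTag | (uint)value;

    // The proposed cheap test: mask the upper bits and compare.
    static bool IsIntRef(ulong r) => (r & 0xFFFFFFFF00000000UL) == IntTag;

    static int GetInt(ulong r) => unchecked((int)(uint)r);

    static void Main()
    {
        ulong r = MakeIntRef(42);
        Console.WriteLine(IsIntRef(r)); // True
        Console.WriteLine(GetInt(r));   // 42
    }
}
```

The catch, as the comment below notes, is that every ordinary object dereference would now also need this check before following the pointer.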

Rilke answered 27/1, 2011 at 16:5 Comment(1)
The problem with that theory, as I note in my answer, is that having object references sometimes be memory addresses and sometimes be something else would impose a speed penalty on very common operations, in exchange for what would at best be a slight speed up on far less common operations. The availability of generics means that boxed value-type primitives are used comparatively rarely, so speeding up operations with them would yield comparatively little benefit.Rilke
Adding to the answers already listed: in .NET, at least with the normal garbage collector, object references are internally stored as direct pointers. This means that when a garbage collection is performed the system has to update every single reference to every object that gets moved, but it also means that "main-line" operation can be very fast. If object references were sometimes direct pointers and sometimes something else, every object dereference would require extra code. Since object dereferencing is one of the most common operations during the execution of a .NET program, even a 5% slowdown here would be devastating unless it was matched by an awesome speedup elsewhere.

It's possible, for example, that a "64-bit compact" model, in which each object reference was a 32-bit index into an object table, might offer better performance than the existing model in which each reference is a 64-bit direct pointer. Dereferencing would require an extra table lookup, which would be bad, but object references would be smaller, allowing more of them to fit in the cache at once. In some circumstances that could be a major performance win (maybe often enough to be worthwhile — maybe not). It's unclear, though, that allowing an object reference to sometimes be a direct memory pointer and sometimes be something else would really offer much advantage.

Rilke answered 27/2, 2012 at 0:31 Comment(0)
