How is the boxing/unboxing behavior of Nullable<T> possible?

Something just occurred to me earlier today that has got me scratching my head.

Any variable of type Nullable<T> can be assigned null. For instance:

int? i = null;

At first I couldn't see how this would be possible without somehow defining an implicit conversion from object to Nullable<T>:

public static implicit operator Nullable<T>(object box);

But the above operator clearly does not exist, as if it did then the following would also have to be legal, at least at compile-time (which it isn't):

int? i = new object();

Then I realized that perhaps the Nullable<T> type could define an implicit conversion to some arbitrary reference type that can never be instantiated, like this:

public abstract class DummyBox
{
    private DummyBox()
    { }
}

public struct Nullable<T> where T : struct
{
    public static implicit operator Nullable<T>(DummyBox box)
    {
        if (box == null)
        {
            return new Nullable<T>();
        }

        // This should never be possible, as a DummyBox cannot be instantiated.
        throw new InvalidCastException();
    }
}

However, this does not explain what occurred to me next: if the HasValue property is false for any Nullable<T> value, then that value will be boxed as null:

int? i = new int?();
object x = i; // Now x is null.

Furthermore, if HasValue is true, then the value will be boxed as a T rather than a T?:

int? i = 5;
object x = i; // Now x is a boxed int, NOT a boxed Nullable<int>.

But this seems to imply that there is a custom implicit conversion from Nullable<T> to object:

public static implicit operator object(Nullable<T> value);

This is clearly not the case as object is a base class for all types, and user-defined implicit conversions to/from base types are illegal (as well they should be).

It seems that object x = i; should box i like any other value type, so that x.GetType() would yield the same result as typeof(int?) (rather than throw a NullReferenceException).
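The observed boxing behavior can be checked with a short sketch. The class name BoxingDemo and the helper Box are illustrative names, not part of any library; the sketch only relies on the standard conversion from int? to object.

```csharp
using System;

public static class BoxingDemo
{
    // Boxes a nullable int simply by converting it to object.
    public static object Box(int? value) => value;

    public static void Main()
    {
        object empty = Box(null); // HasValue is false
        object five = Box(5);     // HasValue is true

        Console.WriteLine(empty == null);                  // True
        Console.WriteLine(five.GetType() == typeof(int));  // True
        Console.WriteLine(five.GetType() == typeof(int?)); // False
    }
}
```

Calling empty.GetType() here would throw a NullReferenceException, because empty is a plain null reference rather than a boxed struct.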

So I dug around a bit and, sure enough, it turns out this behavior is specific to the Nullable<T> type, specially defined in both the C# and VB.NET specifications, and not reproducible in any user-defined struct (C#) or Structure (VB.NET).
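For contrast, here is a minimal sketch of what happens with a hand-written look-alike. Maybe<T> is a hypothetical struct invented for this illustration, not a real framework type. Boxing it always produces a non-null object of type Maybe<T>, whether or not it holds a value.

```csharp
using System;

// Hypothetical stand-in for a hand-written Nullable<T>.
public struct Maybe<T> where T : struct
{
    private readonly T value;

    public bool HasValue { get; }
    public T Value => value;

    public Maybe(T value)
    {
        this.value = value;
        HasValue = true;
    }
}

public static class CustomStructDemo
{
    // Ordinary boxing: the whole struct is copied into a heap object.
    public static object BoxMaybe(Maybe<int> m) => m;

    public static void Main()
    {
        object emptyBox = BoxMaybe(new Maybe<int>());
        object fullBox = BoxMaybe(new Maybe<int>(5));

        Console.WriteLine(emptyBox != null);                        // True
        Console.WriteLine(fullBox.GetType() == typeof(Maybe<int>)); // True
    }
}
```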

Here's why I'm still confused.

This particular boxing and unboxing behavior appears to be impossible to implement by hand. It only works because both C# and VB.NET give special treatment to the Nullable<T> type.

  1. Isn't it theoretically possible that a different CLI-based language could exist where Nullable<T> weren't given this special treatment? And wouldn't the Nullable<T> type therefore exhibit different behavior in different languages?

  2. How do C# and VB.NET achieve this behavior? Is it supported by the CLR? (That is, does the CLR allow a type to somehow "override" the manner in which it is boxed, even though C# and VB.NET themselves prohibit it?)

  3. Is it even possible (in C# or VB.NET) to box a Nullable<T> as object?

Geisler answered 23/9, 2010 at 5:12 Comment(2)
It is the JIT compiler that implements the behavior. More here: #1583550 - Paleography
I realize I'm 7 years late to the party, but I would suggest that anyone as curious as I was also read the reference source for more insight: referencesource.microsoft.com/#mscorlib/system/… - Flush

There are two things going on:

1) The compiler treats "null" not as a null reference but as a null value... the null value for whatever type it needs to convert to. In the case of a Nullable<T> it's just the value which has False for the HasValue field/property. So if you have a variable of type int?, it's quite possible for the value of that variable to be null - you just need to change your understanding of what null means a little bit.
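As a rough illustration of this first point, the sketch below compares the null literal with the other ways of writing the "no value" Nullable<int>. The class and method names are made up for the example.

```csharp
using System;

public static class NullLiteralDemo
{
    public static int? FromNullLiteral()
    {
        int? a = null; // the null literal becomes the "no value" Nullable<int>
        return a;
    }

    public static void Main()
    {
        int? a = FromNullLiteral();
        int? b = new int?();
        int? c = default(int?);

        Console.WriteLine(a.HasValue);                 // False
        Console.WriteLine(a == null);                  // True
        Console.WriteLine(a.Equals(b) && b.Equals(c)); // True
    }
}
```

All three variables hold the same struct value; none of them is a reference, null or otherwise.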

2) Boxing nullable types gets special treatment by the CLR itself. This is relevant in your second example:

    int? i = new int?();
    object x = i;

The CLR boxes any nullable type value differently from non-nullable type values. If the value isn't null, the result is the same as boxing the same value as a non-nullable type value, so an int? with value 5 gets boxed in the same way as an int with value 5, and the "nullability" is lost. However, the null value of a nullable type is boxed to just the null reference, rather than creating an object at all.

This was introduced late in the CLR v2 cycle, at the request of the community.

It means there's no such thing as a "boxed nullable-value-type value".
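The reverse direction works the same way. A small sketch, using the illustrative names UnboxingDemo and Unbox: a plain boxed int or a null reference can each be unboxed to int?.

```csharp
using System;

public static class UnboxingDemo
{
    // Unboxes an object reference to a nullable int.
    public static int? Unbox(object boxed) => (int?)boxed;

    public static void Main()
    {
        object boxedInt = 5;   // a plain boxed int
        object nothing = null; // no object at all

        int? fromInt = Unbox(boxedInt);
        int? fromNull = Unbox(nothing);

        Console.WriteLine(fromInt.HasValue && fromInt.Value == 5); // True
        Console.WriteLine(fromNull.HasValue);                      // False
    }
}
```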

Winer answered 23/9, 2010 at 5:28 Comment(4)
Spot on as usual. Looking at the IL code, object x = i gets converted into a box instruction, indicating support at the CLR level. - Gid
I see what you mean about boxing getting special treatment from the CLR: I just wrote a quick test and looked at the IL in Reflector and noticed that the C# compiler doesn't appear to do anything special to remove the boxing of Nullable<T> itself. But then isn't it strange that the effect of boxing a Nullable<T> is specified independently in the C# and VB.NET specs? Is it in the CLI spec as well, do you know? - Geisler
@Dan: Yes, it's in the C# spec. I don't know if it's in the VB.NET spec. In the CLI spec (ECMA-335) it's in section 8.2.4 of partition 1. - Winer
I was able to find it in the VB spec as well (in section 8.6.1, at least for version 9.0 of the specification). I guess I'm just confused as to why this would be specified in three places (the CLI spec, as well as the C# and VB specs) instead of simply in the CLI spec alone. - Geisler

You got it right: Nullable<T> gets special treatment from the compiler, both in VB and C#. Therefore:

  1. Yes. The language compiler needs to special-case Nullable<T>.
  2. The compiler refactors usage of Nullable<T>. The operators are just syntactic sugar.
  3. Not that I know of.
Soft answered 23/9, 2010 at 5:15 Comment(4)
So in response to your 2nd answer: the C# compiler (for instance) must refactor object x = i; to something like object x = i.HasValue ? (object)i.Value : null; am I right? Which means the language is actually disallowing standard boxing behavior altogether (I just realized that object x = (int?)5; makes x a boxed int, not Nullable<int>). Interesting... - Geisler
Looking at the IL, the CLR has special support for Nullable. The compiler emits a box instruction, and the CLR checks the value to decide whether to produce a null reference or box the underlying value. - Gid
From looking at IL generated from a GetRandomNullable<T> method I just whipped up myself in Reflector, it appears VinayC is right: I don't see the C# compiler actually refactoring the boxing, which implies (to me) that the special treatment actually does occur at the CLR level. But if this is true, then it seems strange (to me) that the behavior would be defined in the language specifications themselves. Any thoughts? - Geisler
@Dan Tao: for boxing, this is true. My answer only covers the operator “overloading” (== null testing, operator lifting etc.). - Soft

I was asking myself the same question, and I also expected to find some implicit operator for Nullable<T> in the .NET Nullable source code. So I looked at the IL generated for int? a = null; to understand what is happening behind the scenes:

C# code:

int? a = null;
int? a2 = new int?();
object a3 = null;
int? b = 5;
int? b2 = new int?(5);

IL code (generated with LINQPad 5):

IL_0000:  nop         
IL_0001:  ldloca.s    00 // a
IL_0003:  initobj     System.Nullable<System.Int32>
IL_0009:  ldloca.s    01 // a2
IL_000B:  initobj     System.Nullable<System.Int32>
IL_0011:  ldnull      
IL_0012:  stloc.2     // a3
IL_0013:  ldloca.s    03 // b
IL_0015:  ldc.i4.5    
IL_0016:  call        System.Nullable<System.Int32>..ctor
IL_001B:  ldloca.s    04 // b2
IL_001D:  ldc.i4.5    
IL_001E:  call        System.Nullable<System.Int32>..ctor
IL_0023:  ret   

We see that the compiler changes int? a = null into something like int? a = new int?(), which is quite different from object a3 = null. So nullables clearly get special compiler treatment.

Inferno answered 9/5, 2018 at 13:33 Comment(0)
