How do ValueTypes derive from Object (ReferenceType) and still be ValueTypes?
B

6

94

C# doesn't allow structs to derive from classes, but all ValueTypes derive from Object. Where is this distinction made?

How does the CLR handle this?

Brayer answered 5/11, 2009 at 17:31 Comment(2)
Outcome of black magic of System.ValueType type in CLR type system.Valle
It seems that the C# type system is ill-defined for its value types. From a theoretical viewpoint, value types cannot derive from reference types; it simply does not make sense. Nevertheless, it works in practice, so it looks like a hack. In designing C#, Microsoft simply decided to give up some theoretical correctness in exchange for greater convenience and ease of use (so that all types in C# can be treated as objects).Abduct
T
123

C# doesn't allow structs to derive from classes

Your statement is incorrect, hence your confusion. C# does allow structs to derive from classes. All structs derive from the same class, System.ValueType, which derives from System.Object. And all enums derive from System.Enum.

UPDATE: There has been some confusion in some (now deleted) comments, which warrants clarification. I'll ask some additional questions:

Do structs derive from a base type?

Plainly yes. We can see this by reading the first page of the specification:

All C# types, including primitive types such as int and double, inherit from a single root object type.

Now, I note that the specification overstates the case here. Pointer types do not derive from object, and the derivation relationship for interface types and type parameter types is more complex than this sketch indicates. However, plainly it is the case that all struct types derive from a base type.

Are there other ways that we know that struct types derive from a base type?

Sure. A struct type can override ToString. What is it overriding, if not a virtual method of its base type? Therefore it must have a base type. That base type is a class.
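As a minimal sketch of the point above, reflection can be asked what the base type of a struct is. The struct name OverridePoint and the class BaseTypeDemo are hypothetical and used only for illustration.

```csharp
using System;

// Hypothetical struct, used only for illustration.
public struct OverridePoint
{
    public int X;
    public int Y;

    // 'override' compiles only because a base class declares a virtual ToString.
    public override string ToString() => "(" + X + ", " + Y + ")";
}

public static class BaseTypeDemo
{
    public static void Main()
    {
        // The struct's base type is the class System.ValueType...
        Console.WriteLine(typeof(OverridePoint).BaseType);   // System.ValueType
        // ...whose own base type is System.Object.
        Console.WriteLine(typeof(ValueType).BaseType);       // System.Object
        // The override is the method that gets called.
        Console.WriteLine(new OverridePoint { X = 1, Y = 2 }.ToString());   // (1, 2)
    }
}
```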

May I derive a user-defined struct from a class of my choice?

Plainly no. This does not imply that structs do not derive from a class. Structs derive from a class, and thereby inherit the heritable members of that class. In fact, structs are required to derive from a specific class: Enums are required to derive from Enum, structs are required to derive from ValueType. Because these are required, the C# language forbids you from stating the derivation relationship in code.

Why forbid it?

When a relationship is required, the language designer has options: (1) require the user to type the required incantation, (2) make it optional, or (3) forbid it. Each has pros and cons, and the C# language designers have chosen differently depending on the specific details of each.

For example, const fields are required to be static, but it is forbidden to say that they are because doing so is first, pointless verbiage, and second, implies that there are non-static const fields. But overloaded operators are required to be marked as static, even though the developer has no choice; it is too easy for developers to believe that an operator overload is an instance method otherwise. This overrides the concern that a user may come to believe that the "static" implies that, say "virtual" is also a possibility.
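The two cases can be put side by side in a small sketch. The type Meters and its members are hypothetical names chosen for illustration.

```csharp
using System;

// Hypothetical type, used only for illustration.
public struct Meters
{
    // A const is implicitly static; adding the 'static' modifier here is a compile-time error.
    public const double PerKilometer = 1000.0;

    public double Value;

    public Meters(double value) { Value = value; }

    // Here 'public static' is mandatory; leaving out 'static' is a compile-time error.
    public static Meters operator +(Meters a, Meters b) => new Meters(a.Value + b.Value);
}

public static class ModifierDemo
{
    public static void Main()
    {
        // The const is reached through the type name, like any other static member.
        Console.WriteLine(Meters.PerKilometer);

        // The operator is resolved statically on the operand types.
        Meters total = new Meters(1.5) + new Meters(2.5);
        Console.WriteLine(total.Value);
    }
}
```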

In this case, requiring a user to say that their struct derives from ValueType seems like mere excess verbiage, and it implies that the struct could derive from another type. To eliminate both these problems, C# makes it illegal to state in the code that a struct derives from a base type, though plainly it does.

Similarly all delegate types derive from MulticastDelegate, but C# requires you to not say that.
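The delegate case can be checked the same way. In this sketch the delegate name Transformer is hypothetical; note that the declaration has no place to write a base type at all.

```csharp
using System;

// Hypothetical delegate type; the declaration syntax has no base-type clause.
public delegate int Transformer(int x);

public static class DelegateBaseDemo
{
    public static void Main()
    {
        // The compiler supplies the required base class.
        Console.WriteLine(typeof(Transformer).BaseType);         // System.MulticastDelegate
        Console.WriteLine(typeof(MulticastDelegate).BaseType);   // System.Delegate
    }
}
```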

So, now we have established that all structs in C# derive from a class.

What is the relationship between inheritance and derivation from a class?

Many people are confused by the inheritance relationship in C#. The inheritance relationship is quite straightforward: if a struct, class or delegate type D derives from a class type B then the heritable members of B are also members of D. It's as simple as that.

What does it mean with regards to inheritance when we say that a struct derives from ValueType? Simply that all the heritable members of ValueType are also members of the struct. This is how structs obtain their implementation of ToString, for example; it is inherited from the base class of the struct.

All heritable members? Surely not. Are private members heritable?

Yes. All private members of a base class are also members of the derived type. It is illegal to call those members by name of course if the call site is not in the accessibility domain of the member. Just because you have a member does not mean you can use it!

We now continue with the original answer:


How does the CLR handle this?

Extremely well. :-)

What makes a value type a value type is that its instances are copied by value. What makes a reference type a reference type is that its instances are copied by reference. You seem to have some belief that the inheritance relationship between value types and reference types is somehow special and unusual, but I don't understand what that belief is. Inheritance has nothing to do with how things are copied.
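A short sketch of the copying difference, using two hypothetical types (ValueCounter and ReferenceCounter) that are identical except for the struct/class keyword:

```csharp
using System;

// Hypothetical types, identical except for 'struct' versus 'class'.
public struct ValueCounter { public int Count; }
public class ReferenceCounter { public int Count; }

public static class CopyDemo
{
    public static void Main()
    {
        var v1 = new ValueCounter { Count = 1 };
        var v2 = v1;          // copies the value itself
        v2.Count = 99;
        Console.WriteLine(v1.Count);   // 1: v1 is unaffected by the change to v2

        var r1 = new ReferenceCounter { Count = 1 };
        var r2 = r1;          // copies the reference; both variables refer to one object
        r2.Count = 99;
        Console.WriteLine(r1.Count);   // 99: the change is visible through r1
    }
}
```

Both types derive (directly or indirectly) from System.Object; the difference in behaviour comes only from how instances are copied.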

Look at it this way. Suppose I told you the following facts:

  • There are two kinds of boxes, red boxes and blue boxes.

  • Every red box is empty.

  • There are three special blue boxes called O, V and E.

  • O is not inside any box.

  • V is inside O.

  • E is inside V.

  • No other blue box is inside V.

  • No blue box is inside E.

  • Every red box is in either V or E.

  • Every blue box other than O is itself inside a blue box.

The blue boxes are reference types, the red boxes are value types, O is System.Object, V is System.ValueType, E is System.Enum, and the "inside" relationship is "derives from".

That's a perfectly consistent and straightforward set of rules which you could easily implement yourself, if you had a lot of cardboard and a lot of patience. Whether a box is red or blue has nothing to do with what it's inside; in the real world it is perfectly possible to put a red box inside a blue box. In the CLR, it is perfectly legal to make a value type that inherits from a reference type, so long as it is either System.ValueType or System.Enum.
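The box rules can be checked with reflection. This is only a sketch: it uses Type.IsValueType as the test for "red box" and Type.BaseType as the "inside" relationship, with int and DayOfWeek standing in for arbitrary red boxes.

```csharp
using System;

public static class BoxRulesDemo
{
    public static void Main()
    {
        // O is not inside any box.
        Console.WriteLine(typeof(object).BaseType == null);                 // True

        // V is inside O, and E is inside V.
        Console.WriteLine(typeof(ValueType).BaseType == typeof(object));    // True
        Console.WriteLine(typeof(Enum).BaseType == typeof(ValueType));      // True

        // O, V and E are blue boxes (reference types).
        Console.WriteLine(typeof(object).IsValueType);                      // False
        Console.WriteLine(typeof(ValueType).IsValueType);                   // False
        Console.WriteLine(typeof(Enum).IsValueType);                        // False

        // Red boxes (value types) sit directly inside V or E.
        Console.WriteLine(typeof(int).IsValueType);                         // True
        Console.WriteLine(typeof(int).BaseType == typeof(ValueType));       // True
        Console.WriteLine(typeof(DayOfWeek).IsValueType);                   // True
        Console.WriteLine(typeof(DayOfWeek).BaseType == typeof(Enum));      // True
    }
}
```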

So let's rephrase your question:

How do ValueTypes derive from Object (ReferenceType) and still be ValueTypes?

as

How is it possible that every red box (value types) is inside (derives from) box O (System.Object), which is a blue box (a reference Type) and still be a red box (a value type)?

When you phrase it like that, I hope it's obvious. There's nothing stopping you from putting a red box inside box V, which is inside box O, which is blue. Why would there be?


AN ADDITIONAL UPDATE:

Joan's original question was about how it is possible that a value type derives from a reference type. My original answer did not really explain any of the mechanisms that the CLR uses to account for the fact that we have a derivation relationship between two things that have completely different representations -- namely, whether the referred-to data has an object header, a sync block, whether it owns its own storage for the purposes of garbage collection, and so on. These mechanisms are complicated, too complicated to explain in one answer. The rules of the CLR type system are quite a bit more complex than the somewhat simplified flavour of it that we see in C#, where there is not a strong distinction made between the boxed and unboxed versions of a type, for example. The introduction of generics also caused a great deal of additional complexity to be added to the CLR. Consult the CLI specification for details, paying particular attention to the rules for boxing and constrained virtual calls.

Tied answered 5/11, 2009 at 18:30 Comment(14)
Thanks Eric. It makes sense now. Do you know why this functionality for structs to be able to derive from classes, etc isn't allowed for C# programmers?Brayer
Language constructs should be meaningful. What would it mean to have an arbitrary value type derived from an arbitrary reference type? Is there anything you could accomplish with such a scheme that you could not also accomplish with user-defined implicit conversions?Tied
I guess not. I just thought you could have some members available to many valuetypes that you see as a group, which you could do using an abstract class to derive the struct. I guess you could use implicit conversions but then you would pay performance penalty, right? If you are doing millions of them.Brayer
Ah, I see. You want to use inheritance not as a mechanism for modeling "is a kind of" relationships, but rather simply as a mechanism for sharing code between a bunch of related types. That seems like a reasonable scenario, though personally I try to avoid using inheritance purely as a code-sharing convenience.Tied
Thanks Eric. You are right. This isn't a deal breaker for me. And I am particularly careful not to overcomplicate structs because of performance reasons.Brayer
Joan, to define the behavior once, you can create an interface, have the structs that should share the behavior implement it, and then create an extension method operating on the interface. One potential issue with this approach is that when the interface methods are called, the struct is boxed first and the boxed copy is passed to the extension method. Any change in state then happens on the copy, which may be unintuitive to users of the API.Remex
So does it mean that Every value type that derives from object, is ACTUALLY contained within object?Classieclassification
Hi, Eric! I am really curious about what you think about @MulleDK19 answer.Zaid
@Sipo: I think it is well-intentioned but it answers a question about the CLI type system when the question was asked about the C# type system, and is therefore misleading and confusing.Tied
@Sipo: Now, to be fair, the question includes "how does the CLR handle this?" and the answer does a good job of describing how the CLR implements these rules. But here's the thing: we should expect that the system that implements a language does not have the same rules as the language! Implementation systems are necessarily lower-level, but let's not confuse the rules of that lower-level system with the rules of the high-level system built on it. Sure, the CLR type system makes a distinction between boxed and unboxed value types, as I noted in my answer. But C# does not.Tied
@EricLippert Life would have been far easy if a ValueType was NOT derived from Object, then people would have not confused that there is boxing behind the scene when they call Object class methods which are not overridden by ValueType, maybe done since .Net1.0 had no generics...Ebonize
@Embedd_Khurja: Keep that in mind the next time you design a type system then!Tied
See Mulle's answer for more details about boxing, and the CLR implementation of value types.Ruggles
"Life would have been far easy if a ValueType was NOT derived from Object" Then it would be impossible to have void acceptStruct(object boxed); That wouldn't make life easier at all... it would be a compile-time errorInmesh
F
23

This is a somewhat artificial construct maintained by the CLR in order to allow all types to be treated as a System.Object.

Value types derive from System.Object through System.ValueType, which is where the special handling occurs (ie: the CLR handles boxing/unboxing, etc for any type deriving from ValueType).

Foliar answered 5/11, 2009 at 17:35 Comment(0)
D
22

Small correction, C# doesn't allow structs to custom derive from anything, not just classes. All a struct can do is implement an interface which is very different from derivation.

I think the best way to answer this is that ValueType is special. It is essentially the base class for all value types in the CLR type system. It's hard to know how to answer "how does the CLR handle this" because it's simply a rule of the CLR.

Digged answered 5/11, 2009 at 17:36 Comment(6)
+1 for the good point about structs not deriving from anything [except implicitly deriving from System.ValueType].Foliar
You say that ValueType is special, but it's worth mentioning explicitly that ValueType itself is actually a reference type.Rumpf
If internally it's possible for structs to derive from a class, why don't they expose it for everyone?Brayer
@Joan, the problem is that it's not. The CLR simply doesn't allow this to happen and C# propagates this restrictionDigged
@Joan: They don't, really. This is just so that you can cast a struct to an object, and there for utility. But technically, when compared to how classes are implemented, value types are handled completely differently by the CLR.Foliar
@JoanVenge I believe the confusion here is saying that structs derive from the ValueType class within the CLR. I believe it's more correct to say that within the CLR, structs don't really exist, the implementation of "struct" within the CLR is actually the ValueType class. So it's not like a struct is inheriting from ValueType in the CLR.Rowland
B
6

A boxed value type is effectively a reference type (it walks like one and quacks like one, so effectively it is one). I would suggest that ValueType isn't really the base type of value types, but rather is the base reference type to which value types can be converted when cast to type Object. Non-boxed value types themselves are outside the object hierarchy.

Billmyre answered 23/1, 2011 at 14:51 Comment(2)
I think you mean, "ValueType isn't really the base type of value types"Rowland
@wired_in: Thanks. Corrected.Billmyre
H
6

Your statement is incorrect, hence your confusion. C# does allow structs to derive from classes. All structs derive from the same class, System.ValueType

So let's try this:

 struct MyStruct :  System.ValueType
 {
 }

This will not even compile. The compiler reports: "Type 'System.ValueType' in interface list is not an interface".

When you decompile Int32, which is a struct, you will find:

public struct Int32 : IComparable, IFormattable, IConvertible {}, which does not mention that it derives from System.ValueType. But in the Object Browser, you do find that Int32 inherits from System.ValueType.

So all these lead me to believe:

I think the best way to answer this is that ValueType is special. It is essentially the base class for all value types in the CLR type system. It's hard to know how to answer "how does the CLR handle this" because it's simply a rule of the CLR.

Hyacinthe answered 20/1, 2015 at 19:59 Comment(2)
The same data structures are used in .NET to describe the contents of value types and reference types, but when the CLR sees a type definition which is defined as deriving from ValueType, it uses that to define two kinds of objects: a heap object type that behaves like a reference type, and a storage location type which is effectively outside the type-inheritance system. Because those two kinds of things are used in mutually-exclusive contexts, the same type descriptors can be used for both. At the CLR level, a struct is defined as a class whose parent is System.ValueType, but C#...Billmyre
...forbids specifying that structs inherit from anything because there's only one thing they can inherit from (System.ValueType), and forbids classes from specifying that they inherit from System.ValueType because any class that was declared that way would behave like a value type.Billmyre
L
5

Rationale

Of all the answers, @supercat's answer comes closest to the actual answer. Since the other answers don't really answer the question, and downright make incorrect claims (for example that value types inherit from anything), I decided to answer the question.

 

Prologue

This answer is based on my own reverse engineering and the CLI specification.

struct and class are C# keywords. As far as the CLI is concerned, all types (classes, interfaces, structs, etc.) are defined by class definitions.

For example, an object type (known in C# as a class) is defined as follows:

.class MyClass
{
}

 

An interface is defined by a class definition with the interface semantic attribute:

.class interface MyInterface
{
}

 

What about value types?

The reason that structs can inherit from System.ValueType and still be value types is that they don't.

Value types are simple data structures. Value types do not inherit from anything and they cannot implement interfaces. Value types are not subtypes of any type, and they do not have any type information. Given a memory address of a value type, it's not possible to identify what the value type represents, unlike a reference type which has type information in a hidden field.

If we imagine the following C# struct:

namespace MyNamespace
{
    struct MyValueType : ICloneable
    {
        public int A;
        public int B;
        public int C;

        public object Clone()
        {
            // body omitted
        }
    }
}

The following is the IL class definition of that struct:

.class MyNamespace.MyValueType extends [mscorlib]System.ValueType implements [mscorlib]System.ICloneable
{
    .field public int32 A;
    .field public int32 B;
    .field public int32 C;

    .method public final hidebysig newslot virtual instance object Clone() cil managed
    {
        // body omitted
    }
}

So what's going on here? It clearly extends System.ValueType, which is an object/reference type, and implements System.ICloneable.

The explanation is that when a class definition extends System.ValueType, it actually defines two things: a value type, and the value type's corresponding boxed type. The members of the class definition define the representation for both the value type and the corresponding boxed type. It is not the value type that extends and implements; it's the corresponding boxed type that does. The extends and implements keywords only apply to the boxed type.

To clarify, the class definition above does 2 things:

  1. Defines a value type with 3 fields (and one method). It does not inherit from anything, and it does not implement any interfaces (value types can do neither).
  2. Defines an object type (the boxed type) with 3 fields (and one interface method implementation), inheriting from System.ValueType and implementing the System.ICloneable interface.

Note also, that any class definition extending System.ValueType is also intrinsically sealed, whether the sealed keyword is specified or not.

Since value types are just simple structures, don't inherit, don't implement and don't support polymorphism, they can't be used with the rest of the type system. To work around this, on top of the value type, the CLR also defines a corresponding reference type with the same fields, known as the boxed type. So while a value type can't be passed around to methods taking an object, its corresponding boxed type can.
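Seen from C#, the step described above looks like an ordinary assignment or argument pass. The following is a minimal sketch with hypothetical names (BoxingDemo, Describe); it shows that the value handed to an object parameter is a separate boxed copy.

```csharp
using System;

public static class BoxingDemo
{
    // Hypothetical helper that accepts any object, so a value type argument must be boxed.
    public static string Describe(object o)
    {
        return o.GetType().Name + (o is ValueType ? " (boxed value type)" : " (reference type)");
    }

    public static void Main()
    {
        int n = 42;
        object boxed = n;     // boxing: a new heap object holding a copy of the value
        n = 7;                // changes the variable only, not the boxed copy

        Console.WriteLine((int)boxed);        // 42
        Console.WriteLine(Describe(n));       // Int32 (boxed value type)
        Console.WriteLine(Describe("text"));  // String (reference type)
    }
}
```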

 

Now, if you were to define a method in C# like

public static void BlaBla(MyNamespace.MyValueType x),

you know that the method will take the value type MyNamespace.MyValueType.

Above, we learned that the class definition that results from the struct keyword in C# actually defines both a value type and an object type. We can only refer to the defined value type, though. Even though the CLI specification states that the constraint keyword boxed can be used to refer to a boxed version of a type, this keyword doesn't exist (see ECMA-335, II.13.1 Referencing value types). But let's imagine for a moment that it does.

When referring to types in IL, a couple of constraints are supported, among which are class and valuetype. If we use valuetype MyNamespace.MyType, we're specifying the value type class definition called MyNamespace.MyType. Likewise, we can use class MyNamespace.MyType to specify the object type class definition called MyNamespace.MyType. This means that in IL you can have a value type (struct) and an object type (class) with the same name and still distinguish them. Now, if the boxed keyword noted by the CLI specification were actually implemented, we'd be able to use boxed MyNamespace.MyType to specify the boxed type of the value type class definition called MyNamespace.MyType.

So, .method static void Print(valuetype MyNamespace.MyType test) cil managed takes the value type defined by a value type class definition named MyNamespace.MyType,

while .method static void Print(class MyNamespace.MyType test) cil managed takes the object type defined by the object type class definition named MyNamespace.MyType.

Likewise, if boxed were a keyword, .method static void Print(boxed MyNamespace.MyType test) cil managed would take the boxed type of the value type defined by a class definition named MyNamespace.MyType.

You'd then be able to instantiate the boxed type like any other object type and pass it around to any method that takes a System.ValueType, object or boxed MyNamespace.MyValueType as an argument, and it would, for all intents and purposes, work like any other reference type. It is NOT a value type, but the corresponding boxed type of a value type.

 

Summary

So, in summary, and to answer the question:

Value types are not reference types and do not inherit from System.ValueType or any other type, and they cannot implement interfaces. The corresponding boxed types that are also defined do inherit from System.ValueType and can implement interfaces.

A .class definition defines different things depending on circumstance.

  • If the interface semantic attribute is specified, the class definition defines an interface.
  • If the interface semantic attribute is not specified, and the definition does not extend System.ValueType, the class definition defines an object type (class).
  • If the interface semantic attribute is not specified, and the definition does extend System.ValueType, the class definition defines a value type and its corresponding boxed type (struct).

Memory layout

This section assumes a 32-bit process.

As already mentioned, value types do not have type information, and thus it's not possible to identify what a value type represents from its memory location. A struct describes a simple data type, and contains just the fields it defines:

public struct MyStruct
{
    public int A;
    public short B;
    public int C;
}

If we imagine that an instance of MyStruct was allocated at address 0x1000, then this is the memory layout:

0x1000: int A;
0x1004: short B;
0x1006: 2 byte padding
0x1008: int C;

Structs default to sequential layout. Fields are aligned on boundaries of their own size. Padding is added to satisfy this.
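The offsets listed above can be cross-checked with the Marshal class. This is only a sketch: Marshal reports the marshaled (unmanaged) layout, which is assumed here to match the managed layout because the struct contains only blittable fields. LayoutDemo is a hypothetical name.

```csharp
using System;
using System.Runtime.InteropServices;

// Same field order as the MyStruct example above.
public struct MyStruct
{
    public int A;
    public short B;
    public int C;
}

public static class LayoutDemo
{
    public static void Main()
    {
        // Offsets are relative to the start of the struct.
        Console.WriteLine(Marshal.OffsetOf<MyStruct>("A").ToInt32());   // 0
        Console.WriteLine(Marshal.OffsetOf<MyStruct>("B").ToInt32());   // 4
        Console.WriteLine(Marshal.OffsetOf<MyStruct>("C").ToInt32());   // 8 (after 2 bytes of padding)

        // Total size: 4 + 2 + 2 (padding) + 4 bytes, with no header.
        Console.WriteLine(Marshal.SizeOf<MyStruct>());                  // 12
    }
}
```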

 

If we define a class in the exact same way, as:

public class MyClass
{
    public int A;
    public short B;
    public int C;
}

Imagining the same address, the memory layout is as follows:

0x1000: Pointer to object header
0x1004: int A;
0x1008: int C;
0x100C: short B;
0x100E: 2 byte padding
0x1010: 4 bytes extra

Classes default to automatic layout, and the JIT compiler will arrange them in the most optimal order. Fields are aligned on boundaries of their own size. Padding is added to satisfy this. I'm not sure why, but every class always has an additional 4 bytes at the end.

Offset 0 contains the address of the object header, which contains type information, the virtual method table, etc. This allows the runtime to identify what the data at an address represents, unlike value types.

Thus, value types do not support inheritance, interfaces, or polymorphism.

Methods

Value types do not have virtual method tables, and thus do not support polymorphism. However, their corresponding boxed type does.

When you have an instance of a struct and attempt to call a virtual method like ToString() defined on System.Object, the runtime has to box the struct.

MyStruct myStruct = new MyStruct();
Console.WriteLine(myStruct.ToString()); // ToString() call causes boxing of MyStruct.

However, if the struct overrides ToString() then the call will be statically bound and the runtime will call MyStruct.ToString() without boxing and without looking in any virtual method tables (structs don't have any). For this reason, it's also able to inline the ToString() call.

If the struct overrides ToString() and is boxed, then the call will be resolved using the virtual method table.

System.ValueType myStruct = new MyStruct(); // Creates a new instance of the boxed type of MyStruct.
Console.WriteLine(myStruct.ToString()); // ToString() is now called through the virtual method table.

However, remember that ToString() is defined in the struct, and thus operates on the struct value, so it expects a value type. The boxed type, like any other class, has an object header. If the ToString() method defined on the struct was called directly with the boxed type in the this pointer, when trying to access field A in MyStruct, it would access offset 0, which in the boxed type would be the object header pointer. So the boxed type has a hidden method that does the actual overriding of ToString(). This hidden method unboxes (address calculation only, like the unbox IL instruction) the boxed type then statically calls the ToString() defined on the struct.

Likewise, the boxed type has a hidden method for each implemented interface method that does the same unboxing then statically calls the method defined in the struct.
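The observable effect of these mechanics can be sketched in C#. The interface ICounter and the struct Counter are hypothetical; the hidden unboxing methods themselves are not visible from C#, only their results are.

```csharp
using System;

// Hypothetical interface and struct, used only for illustration.
public interface ICounter
{
    int Count { get; }
    void Increment();
}

public struct Counter : ICounter
{
    private int _count;

    public int Count => _count;

    public void Increment() { _count++; }

    public override string ToString() => "Counter(" + _count + ")";
}

public static class DispatchDemo
{
    public static void Main()
    {
        var c = new Counter();
        c.Increment();                 // direct call on the value; no boxing needed

        ICounter boxed = c;            // boxes a copy of c
        boxed.Increment();             // interface call operates on the boxed copy

        Console.WriteLine(c.Count);        // 1: the original value is unchanged
        Console.WriteLine(boxed.Count);    // 2: only the boxed copy was incremented

        ValueType asBase = c;          // boxes another copy, typed as the base class
        Console.WriteLine(asBase.ToString());   // Counter(1): virtual dispatch reaches the struct's override
    }
}
```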

 

CLI specification

Boxing

I.8.2.4 For every value type, the CTS defines a corresponding reference type called the boxed type. The reverse is not true: In general, reference types do not have a corresponding value type. The representation of a value of a boxed type (a boxed value) is a location where a value of the value type can be stored. A boxed type is an object type and a boxed value is an object.

Defining value types

I.8.9.7 Not all types defined by a class definition are object types (see §I.8.2.3); in particular, value types are not object types, but they are defined using a class definition. A class definition for a value type defines both the (unboxed) value type and the associated boxed type (see §I.8.2.4). The members of the class definition define the representation of both.

II.10.1.3 The type semantic attributes specify whether an interface, class, or value type shall be defined. The interface attribute specifies an interface. If this attribute is not present and the definition extends (directly or indirectly) System.ValueType, and the definition is not for System.Enum, a value type shall be defined (§II.13). Otherwise, a class shall be defined (§II.11).

Value types do not inherit

I.8.9.10 In their unboxed form value types do not inherit from any type. Boxed value types shall inherit directly from System.ValueType unless they are enumerations, in which case, they shall inherit from System.Enum. Boxed value types shall be sealed.

II.13 Unboxed value types are not considered subtypes of another type and it is not valid to use the isinst instruction (see Partition III) on unboxed value types. The isinst instruction can be used for boxed value types, however.

I.8.9.10 A value type does not inherit; rather the base type specified in the class definition defines the base type of the boxed type.

Value types do not implement interfaces

I.8.9.7 Value types do not support interface contracts, but their associated boxed types do.

II.13 Value types shall implement zero or more interfaces, but this has meaning only in their boxed form (§II.13.3).

I.8.2.4 Interfaces and inheritance are defined only on reference types. Thus, while a value type definition (§I.8.9.7) can specify both interfaces that shall be implemented by the value type and the class (System.ValueType or System.Enum) from which it inherits, these apply only to boxed values.

The non-existent boxed keyword

II.13.1 The unboxed form of a value type shall be referred to by using the valuetype keyword followed by a type reference. The boxed form of a value type shall be referred to by using the boxed keyword followed by a type reference.

Note: The specification is wrong here; there is no boxed keyword.

Epilogue

I think part of the confusion of how value types seem to inherit, stems from the fact that C# uses casting syntax to perform boxing and unboxing, which makes it seem like you're performing casts, which is not really the case (although, the CLR will throw an InvalidCastException if attempting to unbox the wrong type). (object)myStruct in C# creates a new instance of the boxed type of the value type; it does not perform any casts. Likewise, (MyStruct)obj in C# unboxes a boxed type, copying the value part out; it does not perform any casts.
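A small sketch of the unboxing rule mentioned above: a boxed int can only be unboxed as int, even though an unboxed int converts to long without complaint. The names UnboxDemo and UnboxAsLongThrows are hypothetical.

```csharp
using System;

public static class UnboxDemo
{
    // Hypothetical helper: returns true if unboxing 'boxed' directly as long fails.
    public static bool UnboxAsLongThrows(object boxed)
    {
        try
        {
            long value = (long)boxed;   // unboxing requires the exact boxed type
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    public static void Main()
    {
        object boxedInt = 42;           // boxes an Int32

        Console.WriteLine(UnboxAsLongThrows(boxedInt));   // True

        // Unbox to the exact type first, then convert the value.
        long ok = (long)(int)boxedInt;
        Console.WriteLine(ok);                            // 42
    }
}
```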

Loehr answered 28/10, 2018 at 5:53 Comment(4)
Finally, an answer which clearly describes how it works! This one deserves to be the accepted answer. Good job!Oreilly
" How does the CLR handles this ? " What is "this" , you can't answer this question because its comparing how C# handles its Type system versus how CLR handles its type system. So how CLR handles a C# rule is a ill-posed question. It doesn't handle it, Its the C# compiler that forbids it on its type system, but the CLR doesn't because it doesn't have a concept of classes or structs, its type system has another system of rules. CLR doesn't have structs, it only has Types and a Type can inherit from a ValueType (effectively making it a struct from C# POV, just try ILSPY).Inmesh
"Since the other answers don't really answer the question" That's because the question is ill-posed and makes wrong assumptions. "C# doesn't allow structs to derive from classes, but all ValueTypes derive from Object. Where is this distinction made?" The distinction isn't made anywhere. C# has some rules, CIL has a different set of rules.Inmesh
You didn't answer the question either, but knowing the rules of both systems and comparing them is useful though.Inmesh
