How are C# object references represented in memory / at runtime (in the CLR)?

I'm curious to know how C# object references are represented in memory at runtime (in the .NET CLR). Some questions that come to mind are:

  1. How much memory does an object reference occupy? Does it differ when defined in the scope of a class vs the scope of a method? Does where it lives differ based on this scope (stack vs heap)?

  2. What is the actual data maintained within an object reference? Is it simply a memory address that points to the object it refers to or is there more to it? Does this differ based on whether it is defined within the scope of a class or method?

  3. Same questions as above, but this time for a reference to a reference, as when an object reference is passed to a method by reference. How do the answers to 1 and 2 change?

Perrins answered 29/2, 2012 at 0:20 Comment(5)
Note that these questions are all implementation details (which are subject to change,) and not actually about C#, but rather about the .NET CLR.Hoofbound
Chopperdave, good interesting question but I want to ask if you're asking what you meant - An object reference is largely a pointer and that's just a 'number' depending on the architecture of the system your code is running on. If you're asking about how the .Net Heap allocations work, that's a different beast entirely.Pankhurst
Just wanted to add, no insult intended here, I'm not trying to imply that you don't know what you mean - The thing is, is that in .Net this is an ambiguous question and it will help future users on Stack Overflow to know precisely which context we're talking about.Pankhurst
@RussC None taken. In this case I'm very open to the idea I don't know what I'm asking :)Perrins
@Hoofbound Thanks. I'll edit title + tags to indicate.Perrins

This answer is most easily understood if you understand C/C++ pointers. A pointer is simply the memory address of some data.

  1. An object reference should be the size of a pointer, which is normally 4 bytes in a 32-bit process and 8 bytes in a 64-bit process. Its size is the same regardless of where it is defined. Where it lives does depend on where it is defined. If it is an instance field of a class, it resides on the heap inside the object it is part of. If it is a static field, it is located in storage the runtime manages for statics, which is not reclaimed the way ordinary objects are. If it is a local variable, it typically lives on the stack, although the JIT compiler may keep it in a register, and a local that is captured by a lambda or used in an iterator block is hoisted into an object on the heap.

  2. An object reference is simply a pointer, which can be visualized as an int or long containing the address of the object in memory. It is the same regardless of where it is defined.

  3. This is implemented as a pointer to a pointer. The data is the same: just a memory address. However, there is no object at that address. Instead, there is another memory address, which is the original reference to the object. This is what allows a reference parameter to be modified. Normally, a parameter disappears when its method completes. Because the original reference belongs to the caller and is not itself the parameter, changes made to it through the parameter persist. The reference to the reference disappears when the method returns, but the original reference does not. This is the purpose of passing parameters by reference.
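
As a rough illustration of point 3, here is a minimal sketch contrasting a reference passed by value with a reference passed by `ref`. The names `Node`, `RefDemo`, `ReassignByValue`, `ReassignByRef`, and `Mutate` are hypothetical and exist only for this example; the sketch shows the observable behavior, not how the CLR lays anything out in memory.

```csharp
using System;

// Hypothetical type used only for this illustration.
public sealed class Node
{
    public string Name;
    public Node(string name) { Name = name; }
}

public static class RefDemo
{
    // 'n' is a copy of the caller's reference: reassigning it is invisible to the caller.
    public static void ReassignByValue(Node n) { n = new Node("replaced"); }

    // 'n' is an alias for the caller's variable: reassigning it changes what the caller sees.
    public static void ReassignByRef(ref Node n) { n = new Node("replaced"); }

    // The copied reference still refers to the same object, so mutations are shared.
    public static void Mutate(Node n) { n.Name = "mutated"; }

    public static void Main()
    {
        var a = new Node("original");

        ReassignByValue(a);
        Console.WriteLine(a.Name); // original

        ReassignByRef(ref a);
        Console.WriteLine(a.Name); // replaced

        Mutate(a);
        Console.WriteLine(a.Name); // mutated
    }
}
```

Only the `ref` version can make the caller's variable refer to a different object, which is consistent with the "pointer to a pointer" picture described above.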

One thing you should know: value types are stored in place (there is no memory address; instead, the value is stored directly where the memory address would be, see #1). When they are passed to a method, a copy is made and that copy is used in the method. When they are passed by reference, a memory address is passed which locates the value type in memory, allowing it to be changed.
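
A small sketch of the value-type behavior just described, assuming a hypothetical `Point` struct and hypothetical helper methods `MoveCopy` and `MoveInPlace`:

```csharp
using System;

// Hypothetical value type used only for this illustration.
public struct Point
{
    public int X;
    public int Y;
}

public static class ValueTypeDemo
{
    // Receives a copy of the struct: the caller's value is untouched.
    public static void MoveCopy(Point p) { p.X += 10; }

    // Receives the location of the caller's struct: the change is made in place.
    public static void MoveInPlace(ref Point p) { p.X += 10; }

    public static void Main()
    {
        var p = new Point { X = 1, Y = 2 };

        MoveCopy(p);
        Console.WriteLine(p.X); // 1

        MoveInPlace(ref p);
        Console.WriteLine(p.X); // 11
    }
}
```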

Edit: As dlev pointed out, these answers are not hard and fast rules, since nothing says this is how it must be. .NET is free to implement these details however it wants. This is the most likely way to implement them though, as this is how Intel CPUs work internally, so using any other method would likely be inefficient.

Hope I didn't confuse you too much, but feel free to ask if you need clarification.

Prescind answered 29/2, 2012 at 0:42 Comment(8)
Value types aren't always stored on the stack.Ahasuerus
"value types are stored on the stack": that's not true. Value types can be stored on the stack, but it's not always the case. E.g. a value type field in a reference type is stored on the heap, in the object to which it belongs. +1 anyway, because your answer is mostly correct...Melly
@ThomasLevesque: Removed the reference to value types on the stack. Thanks!Prescind
+1 for a comprehensive answer, but as others have said, Value types are the exception to the rule. My gut feeling though, is that whilst this is what the OP asked, it may not be what he meant!Pankhurst
Very clear and comprehensive. Thank you. Luckily a long time ago I took a random course on C++ data types and pointers were commonly referenced (no pun intended).Perrins
A nitpick: local variables don't always live on the stack. Local variables in C# can sometimes live on the heap (closed over variable, iterator block), sometimes in a register, and sometimes they can be removed by the JIT compiler completely, so they don't live anywhere.Bristol
@svick: For the sake of keeping this post from becoming a book, I don't want to bother with every little detail. I'm glad you clarified though.Prescind
@Bristol Additionally, locals that are captured by a lambda will exist inside a closure object (which then might live on the heap).Stater

.NET Heaps and Stacks: this is a thorough treatment of how the stack and heap work.

C# and many other heap-using OOP languages use Handles, not Pointers, for references in this context (C# is also capable of using Pointers!). Pointer analogies work for some general concepts, but this conceptual model breaks down for questions like this one. See Eric Lippert's excellent post on this topic, Handles are Not Addresses.

It is not appropriate to say a Handle is the size of a pointer (although it may coincidentally be the same). Handles are aliases for objects; they are not required to be a formal address of an object.

In this case the CLR happens to use real addresses for the handles. From the above link:

...the CLR actually does implement managed object references as addresses to objects owned by the garbage collector, but that is an implementation detail.

So yes, a handle is probably 4 bytes on a 32-bit architecture and 8 bytes on a 64-bit architecture, but this is not guaranteed, and it is not directly because of pointers. It is worth noting that, depending on the compiler implementation and the address ranges used, some types of pointers can differ in size.
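
If you want to see what your own runtime does, a minimal sketch like the following prints the native pointer size next to the size of an object reference. It assumes a runtime where `System.Runtime.CompilerServices.Unsafe` is available (it ships with modern .NET; on .NET Framework it comes from a separate package). The class and method names are hypothetical, and whatever it prints is an observation about one runtime, not a guarantee.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class ReferenceSizeDemo
{
    // Size in bytes of a native pointer in the current process.
    public static int PointerSize() => IntPtr.Size;

    // Size in bytes of a managed object reference, as reported by the runtime.
    public static int ReferenceSize() => Unsafe.SizeOf<object>();

    public static void Main()
    {
        Console.WriteLine($"64-bit process: {Environment.Is64BitProcess}");
        Console.WriteLine($"IntPtr.Size: {PointerSize()}");
        Console.WriteLine($"Unsafe.SizeOf<object>(): {ReferenceSize()}");
    }
}
```

On the CLR implementations discussed here, the two numbers would be expected to match, which is the "coincidence" described above.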

With all of this context you can probably model this with a pointer analogy, but it's important to realize that Handles are not required to be addresses. The CLR could choose to change this in the future, and consumers of the CLR should be none the wiser.

To drive this subtle point home:

This is a C# Pointer:

int* myVariable;

This is a C# Handle:

object myVariable;

They are not the same.

You can do things like arithmetic on pointers that you shouldn't do with Handles. If your handle happens to be implemented like a pointer and you use it as if it were one, you are misusing the Handle in ways that could get you in trouble later on.

Ahasuerus answered 29/2, 2012 at 2:21 Comment(1)
That is not a C# pointer because it is illegal to make a pointer to a managed reference type. int* would be an example of a C# pointer type.Unbeknown