How do closures work behind the scenes? (C#)

I feel I have a pretty decent understanding of closures, how to use them, and when they can be useful. But what I don't understand is how they actually work behind the scenes in memory. Some example code:

public Action Counter()
{
    int count = 0;
    Action counter = () =>
    {
        count++;
    };

    return counter;
}

Normally, if count were not captured by the closure, its lifetime would be scoped to the Counter() method, and once the method completed it would go away with the rest of the stack allocation for Counter(). But what happens when it is captured? Does the whole stack allocation for this call of Counter() stick around? Is count copied to the heap? Or is it never allocated on the stack at all, because the compiler recognizes it as captured and so always places it on the heap?

For this particular question, I'm primarily interested in how this works in C#, but would not be opposed to comparisons against other languages that support closures.

Cris answered 18/12, 2009 at 14:47 Comment(6)

Great question. I am not sure, but yes, you can keep the stack frame around in C#. Generators use it all the time (think LINQ for data structures) which rely on yield under the hood. Hopefully I am not off the mark. If I am, I will learn a great deal. – Jebel 18/12, 2009 at 14:53

yield turns the method into a separate class with a state machine. The stack itself isn't kept around, but the stack state is moved into class state in a compiler-generated class – Nozicka 18/12, 2009 at 14:55

@thecoop, do you have a link explaining this please? – Jebel 18/12, 2009 at 15:1

Sure, read this series if you want to understand how iterators are built: blogs.msdn.com/oldnewthing/archive/2008/08/12/8849519.aspx – Idler 18/12, 2009 at 15:18

You absolutely CANNOT "keep the stack frame around". The stack frame is on the stack! How would we pop the stack if we were keeping it alive? – Idler 18/12, 2009 at 15:19

Jon Skeet has a section about this in "C# in depth" :) (He even answers questions before they are asked now!?) – Hierodule 18/12, 2009 at 15:44

The compiler (as opposed to the runtime) creates another class/type. The function with your closure and any variables you closed over/hoisted/captured are rewritten throughout your code as members of that class. A closure in .NET is implemented as one instance of this hidden class.

That means your count variable is a member of a different class entirely, and the lifetime of that class works like any other CLR object: it's not eligible for garbage collection until it's no longer rooted. As long as you hold a callable reference to the method, it's not going anywhere.
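
To see the lifetime behaviour from the outside, here is a small sketch. MakeCounter is a made-up variant of the question's Counter() that returns the new value so the captured variable can be observed; the names are illustrative only.

```csharp
using System;

public static class ClosureLifetimeDemo
{
    // Hypothetical variant of Counter() that returns the updated value,
    // so the captured variable can be observed by the caller.
    public static Func<int> MakeCounter()
    {
        int count = 0;          // hoisted into a compiler-generated class
        return () => ++count;   // the delegate keeps that instance alive
    }

    public static int[] Run()
    {
        Func<int> a = MakeCounter();
        Func<int> b = MakeCounter(); // a second call gets its own instance

        int a1 = a(); // 1
        int a2 = a(); // 2: the state survived after MakeCounter returned
        int b1 = b(); // 1: independent of a
        return new[] { a1, a2, b1 };
    }
}
```

Each call to MakeCounter allocates a separate hidden object, which is why a and b count independently.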

Schmid answered 18/12, 2009 at 14:51 Comment(5)

Inspect the code in question with Reflector to see an example of this – Caras 18/12, 2009 at 14:54

...just look for the ugliest named class in your solution. – Ringnecked 18/12, 2009 at 15:3

Does that mean a closure will result in a new heap allocation, even if the value being closured is a primitive? – Cris 11/11, 2010 at 17:14

@Cris - I wouldn't call it 'new', because as far as the resulting code is concerned your primitive was always on the stack. The needed closure is created at the same time as whatever object that will use the closure is created. – Schmid 11/11, 2010 at 17:31

s/always on the stack/always on the heap/ – Schmid 11/11, 2010 at 21:37

Your third guess is correct. The compiler will generate code like this:

private class Locals
{
  public int count;
  public void Anonymous()
  {
    this.count++;
  }
}

public Action Counter()
{
  Locals locals = new Locals();
  locals.count = 0;
  Action counter = new Action(locals.Anonymous);
  return counter;
}

Make sense?

Also, you asked for comparisons. VB and JScript both create closures in pretty much the same way.
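
One observable consequence of this rewriting: two lambdas created in the same scope that capture the same variable share a single instance of the generated class, so each sees the other's writes. A minimal sketch, where MakePair and the other names are invented for illustration:

```csharp
using System;

public static class SharedCaptureDemo
{
    // Both lambdas capture the same local, so they share one hoisted field.
    public static (Action increment, Func<int> read) MakePair()
    {
        int count = 0;
        Action increment = () => count++;
        Func<int> read = () => count;
        return (increment, read);
    }

    public static int Run()
    {
        var (increment, read) = MakePair();
        increment();
        increment();
        return read(); // 2: read sees the writes made through increment
    }
}
```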

Idler answered 18/12, 2009 at 15:15 Comment(2)

Now that .NET handles ref struct better, will closures now use zero-allocation structs rather than classes for the closure when the compiler can prove the closure's lifetime? – Galateah 7/7, 2020 at 1:52

@Dai: Great question and I do not know the answer. Back when I was at Microsoft -- recall that I left in 2012 -- we had a number of ideas for improving closure lifetimes but I do not know if any of them were implemented. – Idler 8/7, 2020 at 4:11

Thanks @HenkHolterman. Since it was already explained by Eric, I added the link just to show what actual class the compiler generates for a closure. I would like to add that the creation of display classes by the C# compiler can lead to memory leaks. For example, suppose that inside a function there is an int variable captured by one lambda expression, and another local variable, holding a reference to a large byte array, captured by a second lambda in the same scope. The compiler creates one display class instance that holds both variables, i.e. the int and the reference to the byte array. The byte array then cannot be garbage collected for as long as either lambda is still referenced, even if the lambda that uses the array was discarded long ago.
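
To make the leak concrete, here is a hand-written approximation of such a display class. The names DisplayClass, Make and BufferStillReachable are invented for readability; the real generated names differ and are best checked with a disassembler.

```csharp
using System;

public static class DisplayClassLeakDemo
{
    // Hand-written stand-in for the compiler-generated display class.
    // Both captured locals become fields of the same object.
    private sealed class DisplayClass
    {
        public int counter;
        public byte[] buffer;

        public void Increment() { counter++; }
        public int BufferLength() { return buffer.Length; }
    }

    // Roughly corresponds to source like:
    //   int counter = 0;
    //   byte[] buffer = new byte[size];
    //   Func<int> length = () => buffer.Length;  // used locally, then dropped
    //   Action increment = () => counter++;      // returned to the caller
    public static Action Make(int size)
    {
        var locals = new DisplayClass();
        locals.counter = 0;
        locals.buffer = new byte[size];
        Func<int> length = locals.BufferLength;
        length();                 // the only use of the buffer
        return locals.Increment;  // the delegate's Target is 'locals'
    }

    // The returned delegate references the display object,
    // which in turn still references the byte array.
    public static bool BufferStillReachable(Action increment)
    {
        var locals = (DisplayClass)increment.Target;
        return locals.buffer != null;
    }
}
```

If this matters in practice, one workaround is to copy the value the long-lived lambda needs into a local declared in a nested scope, so the two lambdas no longer share a display class.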

Rossman answered 18/12, 2009 at 14:47 Comment(0)

Eric Lippert's answer really hits the point. However, it would be nice to build a picture of how stack frames and captures work in general. To do this, it helps to look at a slightly more complex example.

Here is the capturing code:

public class Scorekeeper { 
   int swish = 7; 

   public Action Counter(int start)
   {
      int count = 0;
      Action counter = () => { count += start + swish; };
      return counter;
   }
}

And here is what I think the equivalent would be (if we are lucky Eric Lippert will comment on whether this is actually correct or not):

private class Locals
{
  public Locals( Scorekeeper sk, int st)
  { 
      this.scorekeeper = sk;
      this.start = st;
  } 

  private Scorekeeper scorekeeper;
  private int start;

  public int count;

  public void Anonymous()
  {
    this.count += start + scorekeeper.swish;
  }
}

public class Scorekeeper {
    int swish = 7;

    public Action Counter(int start)
    {
      Locals locals = new Locals(this, start);
      locals.count = 0;
      Action counter = new Action(locals.Anonymous);
      return counter;
    }
}

The point is that the local class substitutes for the entire stack frame and is initialized accordingly each time the Counter method is invoked. Typically the stack frame includes a reference to 'this', plus method arguments, plus local variables. (The stack frame is also in effect extended when a control block is entered.)

Consequently, we do not have just one object corresponding to the captured context; instead, we have one object per captured stack frame.
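
For a version of the equivalent that actually compiles, one option is to nest the helper class inside Scorekeeper, so it can read the private field, and to hoist start into a field so that any write to it is shared with the lambda. This is only a sketch with made-up names (Locals, owner); the real codegen can be inspected with ILDASM.

```csharp
using System;

public class Scorekeeper
{
    private int swish = 7;

    // Nested so that it may read Scorekeeper's private members.
    private sealed class Locals
    {
        public Scorekeeper owner; // captured 'this'
        public int start;         // captured parameter, hoisted to a field
        public int count;         // captured local

        public void Anonymous()
        {
            count += start + owner.swish;
        }
    }

    public Action Counter(int start)
    {
        var locals = new Locals();
        locals.owner = this;
        locals.start = start; // later writes to 'start' would target this field
        locals.count = 0;
        return new Action(locals.Anonymous);
    }
}
```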

Based on this, we can use the following mental model: stack frames are kept on the heap (instead of on the stack), while the stack itself just contains pointers to the stack frames that are on the heap. Lambda methods contain a pointer to the stack frame. This is done using managed memory, so the frame sticks around on the heap until it is no longer needed.

Obviously the compiler can implement this by only using the heap when the heap object is required to support a lambda closure.

What I like about this model is that it provides an integrated picture for 'yield return'. We can think of an iterator method (using yield return) as if its stack frame were created on the heap and the referencing pointer stored in a local variable in the caller, for use during the iteration.
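
As a rough illustration of that picture, here is a hand-written approximation of the class a compiler might build for a simple iterator. CountToIterator and its state numbering are invented for this sketch; the real generated state machine is laid out differently.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hand-written approximation of what is built for:
//   static IEnumerable<int> CountTo(int limit)
//   {
//       for (int i = 1; i <= limit; i++) yield return i;
//   }
public sealed class CountToIterator : IEnumerator<int>
{
    private readonly int limit; // former parameter
    private int i;              // former local
    private int state;          // 0 = not started, 1 = running, 2 = finished

    public CountToIterator(int limit) { this.limit = limit; }

    public int Current { get; private set; }
    object IEnumerator.Current => Current;

    public bool MoveNext()
    {
        if (state == 0) { i = 1; state = 1; }  // first call: run the loop initializer
        else if (state == 1) { i++; }          // resume after the previous yield
        if (state == 1 && i <= limit) { Current = i; return true; }
        state = 2;
        return false;
    }

    public void Reset() => throw new NotSupportedException();
    public void Dispose() { }
}
```

The parameter and the loop variable live in fields of a heap object, and the caller holds the reference and calls MoveNext() to resume where the method left off.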

Gerund answered 25/10, 2017 at 4:26 Comment(5)

It is not correct; how can private swish be accessed from outside class Scorekeeper? What happens if start is mutated? But more to the point: what is the value in answering an eight year old question with an accepted answer? – Idler 25/10, 2017 at 4:35

If you want to know what the real codegen is, use ILDASM or an IL-to-source disassembler. – Idler 25/10, 2017 at 4:37

A better way entirely to think of it is to stop thinking of "stack frames" as something fundamental. The stack is simply a data structure that is used to implement two things: activation and continuation. That is: what are the values associated with the activation of a method, and what code is going to run after this method returns? But the stack is only a suitable data structure for storing activation/continuation information if method activation lifetimes logically form a stack. – Idler 25/10, 2017 at 4:41

Since lambdas, iterator blocks and async all enable method activation lifetimes which do not logically form stacks, the stack cannot be used as a data structure for activations and continuations. So the data structures have to be allocated on the long term pool – Idler 25/10, 2017 at 4:44

Your comments on activations and continuations make sense. Activations usually happen from existing frames, though, so there is kind of an implied ordering to the frames. The fact is that earlier frames can terminate while later frames may continue for a longer time, so there are gaps. In addition I suppose we can also have frames generated by asynchronous hardware events. As for why answer an old question, well, I'm getting value out of what you just posted a minute ago :-). I guess I should have commented on your answer rather than start a new one. – Gerund 25/10, 2017 at 4:53
