Can someone please explain this lazy evaluation code?
Asked Answered
H

3

6

So, this question was just asked on SO:

How to handle an "infinite" IEnumerable?

My sample code:

public static void Main(string[] args)
{
    foreach (var item in Numbers().Take(10))
        Console.WriteLine(item);
    Console.ReadKey();
}

public static IEnumerable<int> Numbers()
{
    int x = 0;
    while (true)
        yield return x++;
}

Can someone please explain why this is lazy evaluated? I've looked up this code in Reflector, and I'm more confused than when I began.

Reflector outputs:

public static IEnumerable<int> Numbers()
{
    return new <Numbers>d__0(-2);
}

For the numbers method, and looks to have generated a new type for that expression:

[DebuggerHidden]
public <Numbers>d__0(int <>1__state)
{
    this.<>1__state = <>1__state;
    this.<>l__initialThreadId = Thread.CurrentThread.ManagedThreadId;
}

This makes no sense to me. I would have assumed it was an infinite loop until I put that code together and executed it myself.

EDIT: So I understand now that .Take() can tell the foreach that the enumeration has 'ended', when it really hasn't, but shouldn't Numbers() be called in it's entirety before chaining forward to the Take()? The Take result is what is actually being enumerated over, correct? But how is Take executing when Numbers has not fully evaluated?

EDIT2: So is this just a specific compiler trick enforced by the 'yield' keyword?

Haggadist answered 29/4, 2010 at 19:14 Comment(0)
M
1

The reason this isn't an infinite loop is you are only enumerating 10 times according to the use of Linq's Take(10) call. Now if you wrote the code something like:

foreach (var item in Numbers())
{
}

Now this is an infinite loop because your enumerator will always return a new value. C# compiler takes this code and transforms it into a state machine. If your enumerator doesn't have a guard clause to break the execution then the caller must which in your sample it does.

The reason the code is lazy is also a reason why the code works. Essentially Take returns the first item, then your application consumes, then it takes another until it has taken 10 items.

Edit

This actually has nothing to do with the addition of take. These are called Iterators. The C# compiler performs a complicated transformation on your code creating an enumerator out of your method. I recommend reading up on it but basically (And this might not be 100% accurate), your code will enter the Numbers method which you could envision as initilizing the state machine.

Once your code hits a yield return, you are in essence saying Numbers() stop executing give them back this result and then when they ask for the next item resume execution at the next line after the yield return.

Erik Lippert has a great series on misc aspects of Iterators

Manzo answered 29/4, 2010 at 19:22 Comment(2)
Yes, but wouldn't Numbers() itself have to be fully evaluated in order to continue with the Take call? I get that Numbers itself is an infinite loop. Why does the addition of .Take() suddenly stop Numbers from being evaluated in it's entirety?Haggadist
Interesting. Thanks for the link! That question (and the code) made me do a double take, and I now realize I have more to learn about Enumerables!Haggadist
S
2

This has to with:

  • What iEnumerable does when certain methods are called
  • The nature of enumeration and the Yield statement

When you enumerate over any sort of IEnumerable, the class gives you the next item that it's going to give you. It doesn't do something to all its items, it just give you the next item. It decides what that item is going to be. (For example, some collections are ordered, some aren't. Some don't guarantee a particular order, but seem to always give them back in the same order you put them in.).

The IEnumerable extension method Take() will enumerate 10 times, getting the first 10 items. You could do Take(100000000), and it would give you a lot of numbers. But you're just doing Take(10). It just asks Numbers() for the next item . . . 10 times.

Each of those 10 items, Numbers gives the next item. To understand how, you'll need to read up on the Yield statement. It's syntactic sugar for something more complicated. Yield is very powerful. (I'm a VB developer and am very annoyed that I still don't have it.) It's not a function; it's a keyword with certain restrictions. And it makes defining an enumerator a lot easier than it otherwise might be.

Other IEnumerable extension methods always iterate through every single item. Calling .AsList would blow it up. Using it most LINQ queries would blow it up.

Sepal answered 29/4, 2010 at 19:23 Comment(1)
Writting custom enumerators using yield is a lot of fun.Manzo
M
1

The reason this isn't an infinite loop is you are only enumerating 10 times according to the use of Linq's Take(10) call. Now if you wrote the code something like:

foreach (var item in Numbers())
{
}

Now this is an infinite loop because your enumerator will always return a new value. C# compiler takes this code and transforms it into a state machine. If your enumerator doesn't have a guard clause to break the execution then the caller must which in your sample it does.

The reason the code is lazy is also a reason why the code works. Essentially Take returns the first item, then your application consumes, then it takes another until it has taken 10 items.

Edit

This actually has nothing to do with the addition of take. These are called Iterators. The C# compiler performs a complicated transformation on your code creating an enumerator out of your method. I recommend reading up on it but basically (And this might not be 100% accurate), your code will enter the Numbers method which you could envision as initilizing the state machine.

Once your code hits a yield return, you are in essence saying Numbers() stop executing give them back this result and then when they ask for the next item resume execution at the next line after the yield return.

Erik Lippert has a great series on misc aspects of Iterators

Manzo answered 29/4, 2010 at 19:22 Comment(2)
Yes, but wouldn't Numbers() itself have to be fully evaluated in order to continue with the Take call? I get that Numbers itself is an infinite loop. Why does the addition of .Take() suddenly stop Numbers from being evaluated in it's entirety?Haggadist
Interesting. Thanks for the link! That question (and the code) made me do a double take, and I now realize I have more to learn about Enumerables!Haggadist
C
0

Basically, your Numbers() function creates an Enumerator.
The foreach will check, in each iteration, if the enumrator has reached the end, and if not, it will continue. Your praticular enumrator will never end, but that doesn't matter. This is being lazyily-evaluated.
The Enumerator will generate the results "live".
What that means is if you would write .Take(3) in there, the loop will only be executed three times. The enumrator would still have some items "left" in it, but they would not be generated since no method needs them, at this time.
If you would try to generate all the numbers from 0 to infinity like the function implies, and returns them all at once, this program, that only uses 10 of them, would be much, much slower. That's the benefit of lazy evaluation - what's never used is never calculated.

Convery answered 29/4, 2010 at 19:23 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.