Let's rewind a little bit: as many others have said, the yield keyword is translated into a state machine.
Strictly speaking, this is not a built-in implementation used behind the scenes; rather, the compiler rewrites the yield-related code into a state machine by implementing one of the relevant interfaces (the return type of the method containing the yield keywords).
A (finite) state machine is just a piece of code that, depending on where you are in the code (that is, on the previous state and the input), moves to another state. This is pretty much what happens when you use yield in a method whose return type is IEnumerator<T> / IEnumerator. Each yield keyword creates a transition from the previous state to the next one, which is why the state management ends up in the MoveNext() implementation.
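To make the idea concrete, here is a minimal hand-written sketch of what such a state machine could look like for a method that only does yield return 1; yield return 2;. The class name CountStateMachine and the fields _state and _current are made up for illustration; the real compiler output uses different names and more bookkeeping.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical hand-written equivalent of:
//     IEnumerator<int> Count() { yield return 1; yield return 2; }
public sealed class CountStateMachine : IEnumerator<int>
{
    // 0 = not started, 1 = after the first yield, 2 = after the second yield, -1 = finished
    private int _state;
    private int _current;

    public int Current => _current;
    object IEnumerator.Current => _current;

    public bool MoveNext()
    {
        switch (_state)
        {
            case 0:
                _current = 1;   // first "yield return"
                _state = 1;
                return true;
            case 1:
                _current = 2;   // second "yield return"
                _state = 2;
                return true;
            default:
                _state = -1;    // end of the method body reached
                return false;
        }
    }

    public void Reset() => throw new NotSupportedException();

    public void Dispose() { }
}

public static class Program
{
    public static void Main()
    {
        using var machine = new CountStateMachine();
        while (machine.MoveNext())
        {
            Console.WriteLine(machine.Current); // prints 1, then 2
        }
    }
}
```

Each call to MoveNext() picks up from the stored state, which is the same mechanism the compiler-generated class relies on.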
This is exactly what the C# compiler (Roslyn) does: it checks for the presence of a yield keyword and for the return type of the containing method (IEnumerator<T>, IEnumerable<T>, IEnumerator or IEnumerable), and then creates a private class reflecting that method, integrating the necessary variables and states.
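One observable consequence of this rewriting is that calling the method only constructs the generated class; none of the method body runs until MoveNext() is called. A small sketch, where DeferredDemo, Numbers and Log are hypothetical names used only for this illustration:

```csharp
using System;
using System.Collections.Generic;

public static class DeferredDemo
{
    // Records how far the iterator body has progressed.
    public static readonly List<string> Log = new List<string>();

    public static IEnumerator<int> Numbers()
    {
        Log.Add("body started");
        yield return 1;
        Log.Add("after first yield");
    }

    public static void Main()
    {
        IEnumerator<int> e = Numbers();
        Console.WriteLine(Log.Count); // 0: only the state machine object was created

        e.MoveNext();
        Console.WriteLine(Log.Count); // 1: the body ran up to the first yield

        e.MoveNext();
        Console.WriteLine(Log.Count); // 2: the body resumed and ran to the end
    }
}
```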
If you are interested in the details of the state machine and of how iterators are rewritten by the compiler, you can check out these links on GitHub:
Trivia 1: the AsyncRewriter (used when you write async/await code) also inherits from StateMachineRewriter, since it also leverages a state machine behind the scenes.
As mentioned, the state machine is heavily reflected in the generated bool MoveNext() implementation, which contains a switch (plus sometimes some old-fashioned goto) based on a state field that represents the different paths of execution to the different states in your method.
The code that the compiler generates from the user code does not look that "good", mostly because the compiler adds some weird prefixes and suffixes here and there.
For example, the code:
public class TestClass
{
    private int _iAmAHere = 0;
    public IEnumerator<int> DoSomething()
    {
        var start = 1;
        var stop = 42;
        var breakCondition = 34;
        var exceptionCondition = 41;
        var multiplier = 2;
        // Rest of the code... with some yield keywords somewhere below...
After compilation, the variables and types related to the piece of code above will look like:
public class TestClass
{
    [CompilerGenerated]
    private sealed class <DoSomething>d__1 : IEnumerator<int>, IDisposable, IEnumerator
    {
        // Always present
        private int <>1__state;
        private int <>2__current;
        // Containing class
        public TestClass <>4__this;
        private int <start>5__1;
        private int <stop>5__2;
        private int <breakCondition>5__3;
        private int <exceptionCondition>5__4;
        private int <multiplier>5__5;
Regarding the state machine itself, let's take a look at a very simple example with some dummy branching that yields "even" / "odd" strings.
public class Example
{
    public IEnumerator<string> DoSomething()
    {
        const int start = 1;
        const int stop = 42;
        for (var index = start; index < stop; index++)
        {
            yield return index % 2 == 0 ? "even" : "odd";
        }
    }
}
This will be translated in MoveNext as:
private bool MoveNext()
{
    switch (<>1__state)
    {
        default:
            return false;
        case 0:
            <>1__state = -1;
            <start>5__1 = 1;
            <stop>5__2 = 42;
            <index>5__3 = <start>5__1;
            break;
        case 1:
            <>1__state = -1;
            goto IL_0094;
        case 2:
            {
                <>1__state = -1;
                goto IL_0094;
            }
            IL_0094:
            <index>5__3++;
            break;
    }
    if (<index>5__3 < <stop>5__2)
    {
        if (<index>5__3 % 2 == 0)
        {
            <>2__current = "even";
            <>1__state = 1;
            return true;
        }
        <>2__current = "odd";
        <>1__state = 2;
        return true;
    }
    return false;
}
As you can see, this implementation is far from straightforward, but it does the job!
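From the caller's side, that generated MoveNext() is driven one step at a time. The following sketch reuses the Example class from above and walks the enumerator by hand, roughly the way a foreach loop would; the Program class and its Collect helper are hypothetical names added for the illustration.

```csharp
using System;
using System.Collections.Generic;

public class Example
{
    public IEnumerator<string> DoSomething()
    {
        const int start = 1;
        const int stop = 42;
        for (var index = start; index < stop; index++)
        {
            yield return index % 2 == 0 ? "even" : "odd";
        }
    }
}

public static class Program
{
    // Drives the state machine manually and gathers everything it yields.
    public static List<string> Collect()
    {
        var results = new List<string>();
        using (IEnumerator<string> e = new Example().DoSomething())
        {
            // Each MoveNext() call runs the body up to the next yield return.
            while (e.MoveNext())
            {
                results.Add(e.Current);
            }
        }
        return results;
    }

    public static void Main()
    {
        List<string> results = Collect();
        Console.WriteLine(results.Count); // 41 (index goes from 1 to 41)
        Console.WriteLine(results[0]);    // odd
        Console.WriteLine(results[1]);    // even
    }
}
```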
Trivia 2: What happens with an IEnumerable / IEnumerable<T> method return type?
Well, instead of just generating a class implementing IEnumerator<T>, the compiler generates a class that implements both IEnumerable<T> and IEnumerator<T>, so that the implementation of IEnumerator<T> GetEnumerator() can leverage the same generated class.
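This can be observed from the outside with a small experiment. The sketch below uses hypothetical names (EnumerableDemo, Numbers) and relies on how current Roslyn versions behave, which is an implementation detail rather than something the language guarantees. With those compilers, the first GetEnumerator() call made on the creating thread typically hands back the sequence object itself, while later calls produce a fresh copy so that enumerations stay independent.

```csharp
using System;
using System.Collections.Generic;

public static class EnumerableDemo
{
    public static IEnumerable<int> Numbers()
    {
        yield return 1;
        yield return 2;
    }

    public static void Main()
    {
        IEnumerable<int> sequence = Numbers();

        // The generated object implements both interfaces.
        Console.WriteLine(sequence is IEnumerator<int>);   // True

        IEnumerator<int> first = sequence.GetEnumerator();
        IEnumerator<int> second = sequence.GetEnumerator();

        // Two enumerations must not share state, so these are distinct objects.
        Console.WriteLine(ReferenceEquals(first, second)); // False

        // Implementation detail: whether the first enumerator is the sequence itself.
        Console.WriteLine(ReferenceEquals(sequence, first));
    }
}
```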
A friendly reminder of the few interfaces that are implemented automatically when you use the yield keyword:
public interface IEnumerable<out T> : IEnumerable
{
    new IEnumerator<T> GetEnumerator();
}
public interface IEnumerator<out T> : IDisposable, IEnumerator
{
    new T Current { get; }
}
public interface IEnumerator
{
    bool MoveNext();
    object Current { get; }
    void Reset();
}
You can also check out this example with different paths / branching and the full implementation produced by the compiler rewriting.
It was created with SharpLab; you can play with that tool to try different yield-related execution paths and see how the compiler rewrites them as a state machine in the MoveNext implementation.
As for the second part of the question, i.e. yield break, it has been answered here:
It specifies that an iterator has come to an end. You can think of
yield break as a return statement which does not return a value.
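A short sketch of that behaviour, using a hypothetical TakeUntilNegative method: once yield break is hit, the iteration ends and any remaining items are never produced.

```csharp
using System;
using System.Collections.Generic;

public static class YieldBreakDemo
{
    // Yields values from the source until the first negative one is seen.
    public static IEnumerable<int> TakeUntilNegative(IEnumerable<int> source)
    {
        foreach (var value in source)
        {
            if (value < 0)
            {
                yield break; // ends the iteration; MoveNext() returns false from here on
            }
            yield return value;
        }
    }

    public static void Main()
    {
        IEnumerable<int> result = TakeUntilNegative(new[] { 1, 2, -1, 3 });
        Console.WriteLine(string.Join(", ", result)); // prints "1, 2"
    }
}
```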