How is foreach implemented in C#? [duplicate]

How exactly is foreach implemented in C#?

I imagine a part of it looking like:

var enumerator = TInput.GetEnumerator();
while(enumerator.MoveNext())
{
  // do some stuff here
}

However I'm unsure what's really going on. What methodology is used for returning enumerator.Current for each cycle? Does it return [for each cycle] or does it take an anonymous function or something to execute the body of foreach?

Colliery answered 24/6, 2012 at 16:30 Comment(3)
Essentially, // do some stuff here gets replaced with the inside of the foreach loop "before" compilation. (Or rather, the compiler generates equivalent bytecode.)Scleroprotein
This is how it is implemented referencesource.microsoft.com/#System.Activities/System/…Foreconscious
Not really a duplicate of the linked question. Only the title remotely matches; the body asks something quite different.Hobson

It doesn't use an anonymous function, no. Basically the compiler converts the code into something broadly equivalent to the while loop you've shown here.

foreach isn't a function call - it's built into the language itself, just like for loops and while loops. There's no need for it to return anything or "take" a function of any kind.
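To make the "broadly equivalent" part concrete, here is a rough sketch of a foreach next to a hand-written expansion of it. The class and method names (ForeachExpansion, SumWithForeach, SumExpanded) are made up for illustration, and the real compiler output differs in details such as null checks and temporaries:

```csharp
using System;
using System.Collections.Generic;

static class ForeachExpansion
{
    // What you write.
    public static int SumWithForeach(IEnumerable<int> source)
    {
        int total = 0;
        foreach (int item in source)
        {
            total += item;
        }
        return total;
    }

    // Roughly what the compiler turns it into (a sketch, not exact output).
    public static int SumExpanded(IEnumerable<int> source)
    {
        int total = 0;
        IEnumerator<int> enumerator = source.GetEnumerator();
        try
        {
            while (enumerator.MoveNext())
            {
                int item = enumerator.Current; // the loop variable is an ordinary local
                total += item;                 // the loop body is placed inline here
            }
        }
        finally
        {
            enumerator.Dispose(); // IEnumerator<T> extends IDisposable
        }
        return total;
    }

    static void Main()
    {
        var numbers = new List<int> { 1, 2, 3 };
        Console.WriteLine(SumWithForeach(numbers)); // 6
        Console.WriteLine(SumExpanded(numbers));    // 6
    }
}
```

Nothing is "returned" per cycle and no delegate is created: Current is simply read into a local and the body runs in place.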

Note that foreach has a few interesting wrinkles:

  • When the compile-time type of the expression is an array, the compiler can use a loop counter and compare it against the length of the array instead of using an IEnumerator
  • foreach will dispose of the iterator at the end; that's simple for IEnumerator<T> which extends IDisposable, but as IEnumerator doesn't, the compiler inserts a check to test at execution time whether the iterator implements IDisposable
  • You can iterate over types which don't implement IEnumerable or IEnumerable<T>, so long as you have an applicable GetEnumerator() method which returns a type with suitable Current and MoveNext() members. As noted in comments, a type can also implement IEnumerable or IEnumerable<T> explicitly, but have a public GetEnumerator() method which returns a type other than IEnumerator/IEnumerator<T>. See List<T>.GetEnumerator() for an example - this avoids creating a reference type object unnecessarily in many cases.
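A rough sketch of the first two wrinkles, written out by hand. The names ForeachWrinkles, SumArray and CountItems are hypothetical, and the actual generated code is not guaranteed to look exactly like this:

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

static class ForeachWrinkles
{
    // Arrays: roughly an indexed loop, with no enumerator object involved.
    public static int SumArray(int[] values)
    {
        int total = 0;
        for (int i = 0; i < values.Length; i++)
        {
            int item = values[i];
            total += item;
        }
        return total;
    }

    // Non-generic IEnumerable: whether to dispose is checked at execution time.
    public static int CountItems(IEnumerable source)
    {
        int count = 0;
        IEnumerator enumerator = source.GetEnumerator();
        try
        {
            while (enumerator.MoveNext())
            {
                object item = enumerator.Current;
                count++;
            }
        }
        finally
        {
            // IEnumerator does not extend IDisposable, so test for it.
            IDisposable disposable = enumerator as IDisposable;
            if (disposable != null) disposable.Dispose();
        }
        return count;
    }

    static void Main()
    {
        Console.WriteLine(SumArray(new[] { 1, 2, 3 }));            // 6
        Console.WriteLine(CountItems(new ArrayList { "a", "b" })); // 2
        // List<T>.GetEnumerator() returns the struct List<T>.Enumerator.
        Console.WriteLine(typeof(List<int>.Enumerator).IsValueType); // True
    }
}
```

The last line illustrates the third point: because List<T>.Enumerator is a value type, a foreach over a variable typed as List<T> does not need to allocate an enumerator object on the heap.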

See section 8.8.4 of the C# 4 spec for more information.

Kraska answered 24/6, 2012 at 16:32 Comment(8)
Your third bullet isn't just for types that don't implement IEnumerable: when a type does implement it and also provides a GetEnumerator method that's different from IEnumerable<T>.GetEnumerator, the type's own GetEnumerator will be used. The standard class List<T> is a good example.Analyzer
Thanks @JonSkeet! Out of interest, is there a parameter type that we can use in C# that looks like Type x in collection. Can we use this dialect ourselves?Colliery
@JamieDixon: I don't understand what you're asking, I'm afraid.Kraska
When we use a foreach we use the syntax int x in collection as in foreach(int x in collection). Is it possible to use this syntax as a property of a standard method?Colliery
For fun I've had a go at implementing my own foreach. I've written about it here: jamie-dixon.co.uk/csharp/implementing-a-custom-foreach-iteratorColliery
It's not clear why you haven't just used Action<T>, modelling this on List<T>.ForEach. And why are you using the non-generic IEnumerable? Also, lambda expressions don't need to be nearly as explicit in many cases :)Kraska
Ah of course. I've changed my local code to use Action<T> and negated the types from the lambda. As for the non-generic IEnumerable, that's just an issue with me not checking wordpress before pushing submit. I'm actually using the generic version. I'll fix that now.Colliery
@JonSkeet I think a word on the casting involved would make this answer complete..Omphalos

I'm surprised the exact implementation hasn't been covered. What you have posted in the question is the simplest form; the complete implementation (including enumerator disposal, casting, etc.) is in section 8.8.4 of the spec.

Now there are 2 scenarios where a foreach loop can be run on a type:

  1. If the type has a public, non-static, non-generic, parameterless method named GetEnumerator which returns something that has a public MoveNext method and a public Current property. As noted by Mr Eric Lippert in this blog article, this was designed to accommodate the pre-generics era, for both type safety and boxing-related performance with value types. Note that this is a case of duck typing. For instance, this works:

    class Test
    {
        public SomethingEnumerator GetEnumerator()
        {
    
        }
    }
    
    class SomethingEnumerator
    {
        public Something Current //could return anything
        {
            get { return ... }
        }
    
        public bool MoveNext()
        {
    
        }
    }
    
    //now you can call
    foreach (Something thing in new Test()) //type safe
    {
    
    }
    

    This is then translated by the compiler to:

    E enumerator = (collection).GetEnumerator();
    try {
       ElementType element; //pre C# 5
       while (enumerator.MoveNext()) {
          ElementType element; //post C# 5
          element = (ElementType)enumerator.Current;
          statement;
       }
    }
    finally {
       IDisposable disposable = enumerator as System.IDisposable;
       if (disposable != null) disposable.Dispose();
    }
    
  2. If the type implements IEnumerable, whose GetEnumerator returns an IEnumerator with a public MoveNext method and a public Current property. An interesting sub-case is that even if you implement IEnumerable explicitly (i.e. there is no public GetEnumerator method on the Test class), you can still use foreach.

    class Test : IEnumerable
    {
        IEnumerator IEnumerable.GetEnumerator()
        {
    
        }
    }
    

    This is because in this case foreach is implemented as (provided there is no other public GetEnumerator method in the class):

    IEnumerator enumerator = ((IEnumerable)(collection)).GetEnumerator();
    try {
        ElementType element; //pre C# 5
        while (enumerator.MoveNext()) {
            ElementType element; //post C# 5
            element = (ElementType)enumerator.Current;
            statement;
        }
    }
    finally {
        IDisposable disposable = enumerator as System.IDisposable;
        if (disposable != null) disposable.Dispose();
    }
    

    If the type implements IEnumerable<T> explicitly then the foreach is converted to (provided there is no other public GetEnumerator method in the class):

    IEnumerator<T> enumerator = ((IEnumerable<T>)(collection)).GetEnumerator();
    try {
        ElementType element; //pre C# 5
        while (enumerator.MoveNext()) {
            ElementType element; //post C# 5
            element = (ElementType)enumerator.Current; //Current is `T` which is cast
            statement;
        }
    }
    finally {
        enumerator.Dispose(); //IEnumerator<T> extends IDisposable
    }
    
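As a runnable counterpart to the first scenario above, here is a minimal sketch of a type that implements no interfaces at all and still works with foreach. Countdown, CountdownEnumerator and DuckTypingDemo are made-up names used only for this illustration:

```csharp
using System;
using System.Collections.Generic;

// Implements no interfaces; foreach binds to it purely by shape.
class Countdown
{
    private readonly int _start;
    public Countdown(int start) { _start = start; }

    public CountdownEnumerator GetEnumerator()
    {
        return new CountdownEnumerator(_start);
    }
}

class CountdownEnumerator
{
    private int _next;
    public CountdownEnumerator(int start) { _next = start + 1; }

    // Strongly typed as int, so no boxing and no cast from object.
    public int Current { get { return _next; } }

    public bool MoveNext()
    {
        if (_next <= 1) return false;
        _next--;
        return true;
    }
}

static class DuckTypingDemo
{
    public static List<int> Collect(Countdown source)
    {
        var seen = new List<int>();
        foreach (int n in source) seen.Add(n);
        return seen;
    }

    static void Main()
    {
        Console.WriteLine(string.Join(", ", Collect(new Countdown(3)))); // 3, 2, 1
    }
}
```

Because CountdownEnumerator is not IDisposable, the finally block in the expansion has nothing to dispose here.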

A few interesting things to note:

  1. In both the above cases the enumerator class should have a public MoveNext method and a public Current property. In other words, if you're implementing the IEnumerator interface, it has to be implemented implicitly. For example, foreach won't work with this enumerator when a public GetEnumerator method returns it:

    public class MyEnumerator : IEnumerator
    {
        void IEnumerator.Reset()
        {
            throw new NotImplementedException();
        }
    
        object IEnumerator.Current
        {
            get { throw new NotImplementedException(); }
        }
    
        bool IEnumerator.MoveNext()
        {
            throw new NotImplementedException();
        }
    }
    

    (Thanks to Roy Namir for pointing this out. The foreach implementation isn't as simple as it seems on the surface.)

  2. Enumerator precedence - if you have a public GetEnumerator method, it is the default choice for foreach, irrespective of which interfaces the type implements. For example:

    class Test : IEnumerable<int>
    {
        public SomethingEnumerator GetEnumerator()
        {
            //this one is called
        }
    
        IEnumerator<int> IEnumerable<int>.GetEnumerator()
        {
    
        }
    }
    

    If you don't have a public implementation (i.e. only explicit implementations), then IEnumerable<T> takes precedence over IEnumerable.

  3. The implementation of foreach involves a cast: each element is cast to the type specified in the foreach statement itself. This means that even if you had written the SomethingEnumerator like this:

    class SomethingEnumerator
    {
        public object Current //returns object this time
        {
            get { return ... }
        }
    
        public bool MoveNext()
        {
    
        }
    }
    

    You could write:

    foreach (Something thing in new Test())
    {
    
    }
    

    This compiles because Something is type-compatible with object under the C# rules; in other words, the compiler allows it whenever an explicit conversion exists between the two types, and rejects it otherwise. The actual cast is performed at run time, where it may or may not fail.
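The precedence and cast points can be checked with a small sketch like the following. Mixed and PrecedenceAndCastDemo are hypothetical names, and the values 1, 2 and 100 are arbitrary markers chosen to show which GetEnumerator was called:

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

class Mixed : IEnumerable<int>
{
    // Public method: chosen by foreach on a Mixed. Yields 1, 2.
    public List<int>.Enumerator GetEnumerator()
    {
        return new List<int> { 1, 2 }.GetEnumerator();
    }

    // Explicit implementation: only reached through the interface. Yields 100.
    IEnumerator<int> IEnumerable<int>.GetEnumerator()
    {
        yield return 100;
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return ((IEnumerable<int>)this).GetEnumerator();
    }
}

static class PrecedenceAndCastDemo
{
    public static List<int> ViaForeach(Mixed m)
    {
        var seen = new List<int>();
        foreach (int n in m) seen.Add(n);
        return seen;
    }

    public static List<int> ViaInterface(Mixed m)
    {
        var seen = new List<int>();
        foreach (int n in (IEnumerable<int>)m) seen.Add(n);
        return seen;
    }

    public static bool CastFailsAtRunTime()
    {
        var items = new ArrayList { "one", 2 }; // Current is typed as object
        try
        {
            // Compiles, because an explicit conversion from object to string exists.
            foreach (string s in items) { }
            return false;
        }
        catch (InvalidCastException)
        {
            return true; // thrown when the boxed int 2 is cast to string
        }
    }

    static void Main()
    {
        Console.WriteLine(string.Join(", ", ViaForeach(new Mixed())));   // 1, 2
        Console.WriteLine(string.Join(", ", ViaInterface(new Mixed()))); // 100
        Console.WriteLine(CastFailsAtRunTime());                         // True
    }
}
```

Casting the collection to the interface is a simple way to force foreach to use the explicit implementation when a public GetEnumerator would otherwise win.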

Omphalos answered 1/12, 2013 at 10:56 Comment(6)
The article you're thinking of is here: blogs.msdn.com/b/ericlippert/archive/2011/06/30/…Genesisgenet
@EricLippert thanks, I will update it..Omphalos
Of course, if either IEnumerable or IEnumerable<X> for one type X (IEnumerable<Y> could be implemented differently in pathological cases) is implemented implicitly, then surely there is a "valid" public GetEnumerator, so that falls under case 1. above. (We disregard the case where GetEnumerator from a base class is hidden by an identical overload in the relevant type.) But if you implement the interfaces explicitly, you can still have a "bad" public GetEnumerator. For example if you have public void/* bad */ GetEnumerator() { }, the IEnumerable will not be considered.Concubinage
Case 1. is attempted if GetEnumerator is public, non-static, non-generic and takes zero parameters, if I remember correctly. If GetEnumerator is not like that, or doesn't exist, IEnumerable<> or IEnumerable is considered.Concubinage
@JeppeStigNielsen Surely, that falls under case 1. If there is a public GetEnumerator that is all it matters. I did mention it in the answer. As an example, I put a more surprising case, ie, a simple standalone public GetEnumerator which can override the explicit ones even if they are from IEnumerable interfaces. And of course it should be non generic and non static as well, which I will update. Thanks!Omphalos
#31343347Buskirk
