When should I separately implement IEnumerator<T>?
Asked Answered
H

2

4

In the framework classes of collections I have often seen IEnumerator<T> separately implemented as an inner class and an instance of it is returned in the GetEnumerator method.

Now suppose I'm writing my own collection classes which will have an inbuilt collection like List<T> or T[] act as the holder internally, say like this:

public class SpecialCollection<T> : IEnumerable<T>
{
    List<T> list;

    public SpecialCollection<T>()
    {

    }

    public IEnumerator<T> GetEnumerator()
    {
        return list.GetEnumerator();

        //or
        return list.Where(x => some logic).GetEnumerator(); 

        //or directly rely on yield keyword
        yield return x; //etc
    }
}

Should I be writing my own enumerator class, or is it ok to return enumerator of the List<T> class? Is there any circumstance under which should I be writing my own enumerator class?


I have a related question as well. If it's not all that important or doesn't make much of a difference, why do every collection class in the BCL write their own IEnumerator?

For eg, List<T> class has something like

T[] items;
public IEnumerator<T> GetEnumerator()
{
    return new List<T>.Enumerator(items);
}
Heyman answered 10/6, 2014 at 13:41 Comment(21)
As opposed to what? Many of those classes were written before iterators existed.Veilleux
Generics were introduced at the same time as the yield keyword, if I'm not mistakenNjord
Don't use the wasteful array.Select(x => x).GetEnumerator();, you can use ((IEnumerable<T>)array).GetEnumerator(). One-dimensional arrays implement the generic interfaces in a "magical" way that looks like explicit interface implementation from the outside.Brood
Many of those enumerators are structs. Perhaps that has something to do with it.Njord
@JeppeStigNielsen I'm not sure I get your point. That was some example.Heyman
@JeppeStigNielsen I think the // some logic was meant that some additional logic can be added to the enumerator.Stoddard
Isn't a duplicate, but may clear up some points: #3738497Scourge
I can understand if this q was a duplicate, or vague. Opinion based? What's wrong?Heyman
See if ericlippert.com/2011/06/30/following-the-pattern helps youBracteate
@Stoddard Suppose I write my own type class MyColl : IEnumerable<int>. I could hold my values in an array internally, so have a field private int[] array;. Now, when I need to implement the generic interface, I can't just say return array.GetEnumerator(); (compile-time error) because of the not-quite-generic nature of arrays (here int[]). I could solve that by return array.Select(x => x).GetEnumerator();. But that is wasteful. Instead use return ((IEnumerable<int>)array).GetEnumerator(); or equivalently return array.AsEnumerable().GetEnumerator();. See what I mean?Brood
@JeppeStigNielsen good point. Just to reiterate, the examples weren't carefully crafted ones. I only wanted to convey the intent.Heyman
@JonSkeet Thanks. But that link talks about what forced the need for duck typing in case of foreach and why struct is preferred over a reference type. It doesn't talk anything about re-using the existing enumerator implementations.Heyman
@Heyman That assumes you have an existing enumerator implementation to reuse. If you do, great, if you don't, then you have to do something else. At this post an iterator block will cover most everyone, that it, everyone but those with the strictest of performance requirements.Rotund
@nawfal: It talks about why List<T> has its own (struct) enumerator type.Bracteate
@Rotund yes, that is what my question is about. That is, if there already exists enumerator implementations to re-use, is there any point in writing new ones. I asked this because I see it all the time in framework classes. But as Eric answers, performance is the only reason.Heyman
@JonSkeet from what I understand, it doesn't talk about why List<T> has its own (struct) enumerator type. It talks about why List<T> has its own (struct) enumerator type.Heyman
@nawfal: I'm not at all sure what difference you're trying to specify there, unless it's in terms of the highlighting. Given that the enumerator needs access to the list's private members, it has to be a nested struct... how would you suggest implementing it otherwise?Bracteate
@JonSkeet The article doesn't talk about not using the GetEnumerator of the backing array (which is what my question is about). It talks about why it is required to be struct in the first place. My q is, if you require a struct, then why not re-use the already implemented struct of backing array. I hope its clear. Not to be nitpicking, but honestly the link doesnt clear my confusion at all. Eric's answer does.Heyman
@Heyman It seems you're missing the fact that a foreach over an array is recognized by the compiler and transformed into a for loop instead, as that's much faster. This means that only boxed array enumerators are actually used, and the act of boxing also eliminates the possibility of the list-style optimizations as well, so there's no need for arrays to use the list's trick of having a struct as an enumerator (so they don't; their enumerator is a class), because they have something even better that's specific to them.Rotund
@nawfal: If you return the IEnumerator<T> of the array, you wouldn't be able to detect the list changing between calls, which should trigger an exception.Bracteate
@JonSkeet That's an excellent point and answers my question!. If you could make it an answer I will appreciate it (vcsjones has answered it, but for some reason he has deleted it). But I also do think utilizing enumerators of anything above array (like List<T>, Queue<T>) in the framework is not a bad choice considering they do catch modifications.Heyman
P
7

The answer to your first question is: when a yield return doesn't meet your needs.

The answer to your second question is: these heavily used types have performance requirements that are unusually strict, so the enumerators are custom built. I've written some articles on this recently; see:

http://ericlippert.com/2014/05/21/enumerator-advance/

http://ericlippert.com/2014/06/04/enumerator-bounds/

Precondition answered 10/6, 2014 at 13:50 Comment(0)
B
2

Just to answer one part:

List<T> has its own enumerator implementation for two reasons:

  • It can't just return the iterator of its backing array, because:
    • It needs to detect structural changes in the list (additions and removals) in order to invalidate the enumerator
    • The array may well be larger than the list. (Using Take would fix this, at the cost of another level of indirection)
  • The above could be performed with an iterator block (although there'd have to be a non-iterator method first, in order to capture the "structural version" of the list at call time rather than first iteration time), but that's relatively inefficient compared with the actual highly-optimized mutable struct implementation

Using a mutable struct here has certain issues, but when used in the expected fashion, it avoids heap allocations, virtual method calls via references etc.

Bracteate answered 10/6, 2014 at 15:45 Comment(4)
The array length is another excellent point, easy to miss. But I do think returning the enumerator of built in types from the framework like dictionary, list, stack, queue, set etc (other than array of course) is ok, considering their enumerator handles this collection modification issue. Isn't it?Heyman
@nawfal: Returning the enumerator in what context? It's often okay, but sometimes it isn't. There's no one-size-fits-all answer here.Bracteate
I meant in the same case as in the example I have provided in my question. If the backing collection type of my class SpecialCollection<T> is List<T>, then returning List<T>.GetEnumerator from the SpecialCollection<T>.GetEnumerator method is error free, I guess.Heyman
@nawfal: Yes, if you don't have any other particular requirements, I think that should be okay.Bracteate

© 2022 - 2024 — McMap. All rights reserved.