C# "Generator" Method
I come from the world of Python and am trying to create a "generator" method in C#. I'm parsing a file in chunks of a specific buffer size, and I want to read and store only one chunk at a time, yielding each chunk to a foreach loop. Here's what I have so far (simplified proof of concept):

class Page
{
    public uint StartOffset { get; set; }
    private uint currentOffset = 0;

    public Page(MyClass c, uint pageNumber)
    {
        // assign the property; declaring a new local here would leave it unset
        StartOffset = pageNumber * c.myPageSize;

        if (StartOffset < c.myLength)
            currentOffset = StartOffset;
        else
            throw new ArgumentOutOfRangeException(nameof(pageNumber), "Page offset exceeds end of file");

        while (currentOffset < c.myLength && currentOffset < (StartOffset + c.myPageSize))
            // read data from page and populate members (not shown for MWE purposes)
            . . .
    }
}

class MyClass
{
    public uint myLength { get; set; }
    public uint myPageSize { get; set; }

    public IEnumerator<Page> GetEnumerator()
    {
        for (uint i = 1; i < this.myLength; i++)
        {
            // start count at 1 to skip first page
            Page p = new Page(this, i);
            try
            {
                yield return p;
            }
            catch (ArgumentOutOfRangeException)
            {
                // end of available pages, how to signal calling foreach loop?
            }
        }
    }
}

I know this is not perfect since it is a minimum working example (I don't allow many of these properties to be set publicly, but for keeping this simple I don't want to type private members and properties).

However, my main question is how do I let the caller looping over MyClass with a foreach statement know that there are no more items left to loop through? Is there an exception I throw to indicate there are no elements left?

Rhythmics answered 8/7, 2016 at 20:14 Comment(11)
You simply stop yielding items, just like in Python. That being said, you should make a method that returns an IEnumerable<BTreePage>; enumerables are easier to consume. - Forepeak
IEnumerator<T>.MoveNext is what tells the caller to stop iterating. This is implemented for you when you use yield return. If you wish to stop explicitly, you can use yield break. - Durbar
@Forepeak The inconsistency is my fault in the example. Page is a made-up thing for this post; BTreePage is really what I'm returning in my real code. Fixed. - Rhythmics
I was commenting about using IEnumerable<T> vs. IEnumerator<T>. - Forepeak
@Forepeak I missed the distinction. Got a link to explain more or show an example? - Rhythmics
@Rhythmics See this question. Basically (since you're coming from Python), IEnumerable is the generator, or the list, and IEnumerator would be the thing you get when you call iter() on it (the technical thing). Most of the time, you would want the IEnumerable since that's easier to consume (e.g. using a foreach loop). - Forepeak
@Forepeak It's confusing, because if I implement it in the class, it requires two GetEnumerator methods, one generic <> and one not. Makes no sense, but it errors if I don't have both. - Rhythmics
With the accepted answer, can't I just do foreach (Page p in myClass)? - Rhythmics
No, don't implement it. Just make your method return that type. I.e. instead of IEnumerator<Page> GetEnumerator(), change the signature to IEnumerable<Page> GetPages() (use a proper name too). The implementation is the same. - Forepeak
No, for the foreach to work, you need to return IEnumerable instead of IEnumerator. - Forepeak
@Forepeak Crap. I gotta figure this out. But that's another question, I suppose. - Rhythmics
F
5

As mentioned in the comments, you should use IEnumerable<T> instead of IEnumerator<T>. The enumerator is the technical object that is used to enumerate over something. That something, in many cases, is an enumerable.

C# has special abilities to deal with enumerables. Most prominently, you can use a foreach loop with an enumerable (but not with an enumerator, even though the loop actually uses the enumerator of the enumerable). Also, enumerables allow you to use LINQ, which makes them even easier to consume.

So you should change your class like this:

class MyClass
{
    public uint myLength { get; set; }
    public uint myPageSize { get; set; }

    // note the modified signature
    public IEnumerable<Page> GetPages()
    {
        for (uint i = 1; i < this.myLength; i++)
        {
            Page p;
            try
            {
                p = new Page(this, i);
            }
            catch (ArgumentOutOfRangeException)
            {
                yield break;
            }
            yield return p;
        }
    }
}

In the end, this allows you to use it like this:

var obj = new MyClass();

foreach (var page in obj.GetPages())
{
    // do whatever
}

// or even using LINQ
var pageOffsets = obj.GetPages().Select(p => p.StartOffset).ToList();

Of course, you should also change the name of the method to something meaningful. If you’re returning pages, GetPages is maybe a good first step in the right direction. The name GetEnumerator is kind of reserved for types implementing IEnumerable, where the GetEnumerator method is supposed to return an enumerator of the collection the object represents.
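If you do want foreach to work on the object itself, the type has to implement IEnumerable<T>, which is also why the compiler asks for two GetEnumerator methods. Here is a minimal sketch of that; DemoPage and PageCollection are made-up names for illustration, not the asker's real types.

```csharp
using System.Collections;
using System.Collections.Generic;

// Hypothetical stand-in for the real page type; it only records its number.
class DemoPage
{
    public uint Number { get; }
    public DemoPage(uint number) { Number = number; }
}

// Hypothetical container that is itself enumerable.
class PageCollection : IEnumerable<DemoPage>
{
    private readonly uint pageCount;

    public PageCollection(uint pageCount) { this.pageCount = pageCount; }

    // Generic enumerator; an iterator method may return IEnumerator<T> too.
    public IEnumerator<DemoPage> GetEnumerator()
    {
        // start at 1 to skip the first page, as in the question
        for (uint i = 1; i < pageCount; i++)
            yield return new DemoPage(i);
    }

    // IEnumerable<T> inherits the non-generic IEnumerable, so this second
    // method is required; it usually just forwards to the generic one.
    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}
```

With that in place, foreach (var page in new PageCollection(4)) works directly, and so does LINQ. For a one-off sequence, a plain method returning IEnumerable<T> is still less ceremony.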

Forepeak answered 8/7, 2016 at 20:48 Comment(1)
Makes much more sense. Thank you!Rhythmics
B
1

There are two ways to do it: let execution reach the end of the GetEnumerator function, or put a yield break; in the code. The latter behaves like a return; in a function that returns void.

From the caller's perspective, the enumerator returned from GetEnumerator() will start returning false for MoveNext(); that is how the caller can tell that the enumerator is done.


As for your "Can't yield a value inside the body of a try block with a catch clause" error: you put the try/catch around the wrong part of the code. The exception will be thrown by the new, not by the yield return. Your code should look like this:
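To make the MoveNext() part concrete, here is a rough sketch of what a foreach loop does with an iterator. The names EnumeratorDemo, Numbers and ConsumeManually are made up for this example, and the real compiler expansion has a few more details.

```csharp
using System.Collections.Generic;

static class EnumeratorDemo
{
    // Hypothetical iterator: yields 1 and 2, then simply ends.
    public static IEnumerable<int> Numbers()
    {
        yield return 1;
        yield return 2;
        // falling off the end here makes MoveNext() return false
    }

    // Roughly what a foreach loop over Numbers() does by hand.
    public static List<int> ConsumeManually()
    {
        var seen = new List<int>();
        using (IEnumerator<int> e = Numbers().GetEnumerator())
        {
            while (e.MoveNext())   // false once the iterator body finishes
                seen.Add(e.Current);
        }
        return seen;
    }
}
```

So there is no exception to throw and no sentinel to return: the loop ends because MoveNext() reports that nothing is left.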

public IEnumerator<Page> GetEnumerator()
{
    for (uint i = 1; i < this.myLength; i++)
    {
        // start count at 1 to skip first page
        Page p;
        try
        {
            p = new Page(this, i);
        }
        catch (ArgumentOutOfRangeException)
        {
            yield break;
        }
        yield return p;
    }
}
Bantustan answered 8/7, 2016 at 20:19 Comment(6)
This is the right answer, but now I have two problems ;) -- apparently I can't use yield inside a try statement. Arg. - Rhythmics
Odd. I get "Can't yield a value inside the body of a try block with a catch clause". - Rhythmics
Yes, you can't do that. Just capture the value in a variable, and yield it after the try/catch. - Forepeak
@Rhythmics Updated the answer with a full example. You should have had the try around the new anyway; that is where the exception would happen. - Bantustan
Awesome, so I can just do foreach (Page in myClass), which is my goal. Thank you. - Rhythmics
@Rhythmics One note: using an ArgumentOutOfRangeException is kind of a bad design choice. You should not be using exceptions for your normal program flow control. A better design choice would be to do the range math inside GetEnumerator and only call new Page if it is a valid range value. - Bantustan
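A possible sketch of that last suggestion, doing the range math up front so no exception is needed. PageMath and PageStartOffsets are hypothetical names; the two parameters stand in for myLength and myPageSize, and the method yields start offsets rather than Page objects to stay self-contained.

```csharp
using System.Collections.Generic;

static class PageMath
{
    // Yields the start offset of every page after the first one.
    // 'length' and 'pageSize' mirror myLength and myPageSize from the question.
    public static IEnumerable<uint> PageStartOffsets(uint length, uint pageSize)
    {
        if (pageSize == 0)
            yield break; // no sensible pages; also avoids an endless loop

        // ulong arithmetic avoids uint overflow near the 4 GiB boundary
        for (ulong offset = pageSize; offset < length; offset += pageSize)
            yield return (uint)offset; // safe cast: offset < length
    }
}
```

In the real code the loop body would presumably construct the page for each valid offset (or page number) instead of yielding the offset itself.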
F
0

Use the yield break; statement to end the sequence that your iterator method is generating.
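A small illustration, assuming a made-up helper named UntilNegative that stops at the first negative value:

```csharp
using System.Collections.Generic;

static class YieldBreakDemo
{
    // Hypothetical helper: yields values until the first negative one.
    public static IEnumerable<int> UntilNegative(IEnumerable<int> source)
    {
        foreach (int value in source)
        {
            if (value < 0)
                yield break; // ends the sequence; later values are never produced
            yield return value;
        }
    }
}
```

A caller looping over UntilNegative(new[] { 3, 5, -1, 7 }) with foreach would see 3 and 5, and then the loop would simply end.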

Fauteuil answered 8/7, 2016 at 20:19 Comment(0)
