How do you design an enumerator that returns (theoretically) an infinite amount of items?
Asked Answered
S

4

6

I'm writing code that looks similar to this:

public IEnumerable<T> Unfold<T>(this T seed)
{
    while (true)
    {
        yield return [next (T)object in custom sequence];
    }
}

Obviously, this method is never going to return. (The C# compiler silently allows this, while R# gives me the warning "Function never returns".)

Generally speaking, is it bad design to provide an enumerator that returns an infinite number of items, without supplying a way to stop enumerating?

Are there any special considerations for this scenario? Mem? Perf? Other gotchas?

If we always supply an exit condition, which are the options? E.g:

  • an object of type T that represents the inclusive or exclusive boundary
  • a Predicate<T> continue (as TakeWhile does)
  • a count (as Take does)
  • ...

Should we rely on users calling Take(...) / TakeWhile(...) after Unfold(...)? (Maybe the preferred option, since it leverages existing Linq knowledge.)

Would you answer this question differently if the code was going to be published in a public API, either as-is (generic) or as a specific implementation of this pattern?

Stop answered 13/10, 2008 at 11:31 Comment(1)
BTW, you could find this interesting in regard to your question: blogs.msdn.com/b/wesdyer/archive/2007/02/13/…Corwin
D
7

So long as you document very clearly that the method will never finish iterating (the method itself returns very quickly, of course) then I think it's fine. Indeed, it can make some algorithms much neater. I don't believe there are any significant memory/perf implications - although if you refer to an "expensive" object within your iterator, that reference will be captured.

There are always ways of abusing APIs: so long as your docs are clear, I think it's fine.

Defame answered 13/10, 2008 at 11:33 Comment(4)
Apart from documentation, I think the method name should if possible indicate that the enumeration is infinite. I.e. "IEnumerable<T> GetAllPrimeNumbers()" is fine, but I'm not sure about a name like Unfold<T>.Palladous
"Unfold" is the name of the higher order function, which is the inverse of "fold" and, IMO, a valid name in a generic context. As for specific implementations, yes, better naming is desirable.Stop
Btw, the Aggregate method is the Linq implementation of the "fold" function, and to my knowledge there's no built-in inverse implementation. Please correct me if I'm wrong.Stop
SelectMany is probably the closest to Unfold, but it's not a direct mapping. It will let you easily unfold a sequence of sequences, but not a sequence of "some sequences and some elements".Defame
H
6

"Generally speaking, is it bad desing to provide an enumerator that returns an infinite amount of items, without supplying a way to stop enumerating?"

The consumer of the code, can always stop enumerating (using break for example or other means). If your enumerator returns and infinite sequence, that doesn't mean the client of the enumerator is somehow forced to never break enumeration, actually you can't make an enumerator which is guaranteed to be fully enumerated by a client.

Should we rely on users calling Take(...) / TakeWhile(...) after Unfold(...)? (Maybe the preferred option, since it leverages existing Linq knowledge.)

Yes, as long as you clearly specify in your documentation that the enumerator returns and infinite sequence and breaking of enumeration is the caller's responsibility, everything should be fine.

Returning infinite sequences isn't a bad idea, functional programing languages have done it for a long time now.

Halakah answered 13/10, 2008 at 11:47 Comment(0)
F
2

I agree with Jon. Compiler transforms your method to class implementing simple state machine that keeps reference to current value (i.e. value that will be returned via Current property). I used this approach several times to simplify code. If you clearly document method's behavior it should work just fine.

Frentz answered 13/10, 2008 at 12:3 Comment(0)
O
0

I would not use an infinite enumerator in a public API. C# programmers, myself included, are too used to the foreach loop. This would also be consistent with the .NET Framework; notice how the Enumerable.Range and Enumerable.Repeat methods take an argument to limit the number of items in the Enumerable. Microsoft chose to use Enumerable.Repeat(" ", 10) instead of Enumerable.Repeat(" ").Take(10) to avoid the infinite enumeration and I would adhere to their design choices.

Overcloud answered 13/10, 2008 at 12:15 Comment(2)
C# developers should start to learn that enumerators enumerate over sequences (finite or infinite) and not over lists. Finite generators are not appropriate for all classes of problems, sometimes you can't know how many elements you need beforehand.Halakah
The purpose of Repeat/Range are to repeat for a given number of iterations - that's why the number's there. It's perfectly reasonable in other scenarios to have infinite sequences.Defame

© 2022 - 2024 — McMap. All rights reserved.