Nim: How to wrap/derive an iterator from another iterator?

Asked 12/4, 2015 at 10:21 Answered 18/5, 2015 at 7:21

Let's assume we have some existingIterator which iterates over elements of an arbitrary type T. What I now want to achieve is to derive a new iterator from existingIterator with a modified behavior. Think of examples like:

Limiting the length of the original iterator, e.g., existingIterator.take(n).
Mapping over the elements, e.g., existingIterator.map(modifier)
Filtering certain elements, e.g., existingIterator.filter(predicate).

In all these cases I simply want to produce yet another iterator so that I can do something like this:

for x in existingIterator.filter(something)
                         .map(modifier)
                         .take(10):
  ...

My general problem is: How can I write a generic iterator or template which takes an existing iterator and returns a modified iterator?

A follow-up question would be why such essential functionality is not in the standard library -- maybe I'm missing something?

Here is what I have tried:

Attempt 1

Let's take the take(n) functionality as an example. My first approach was to use a regular generic iterator:

iterator infinite(): int {.closure.} =
  var i = 0
  while true:
    yield i
    inc i

iterator take[T](it: iterator (): T, numToTake: int): T {.closure.} =
  var i = 0
  for x in it():
    if i < numToTake:
      yield x
    inc i

for x in infinite.take(10):
  echo x

This compiles, but unfortunately, it does not really work: (1) the elements are not properly iterated (they are all just zero, maybe a bug?), (2) it looks like my program is stuck in an endless loop, and (3) it only works for closure iterators, which means that I cannot wrap arbitrary iterators.

Attempt 2

The limitation to closure iterators suggests that this problem actually requires a template solution.

template take[T](it: iterator(): T, numToTake: int): expr {.immediate.} =
  var i = 0
  iterator tmp(): type(it()) =
    for item in it:
      if i < numToTake:
        yield item
        inc i
  tmp

This almost seems to work (i.e., the template compiles). However, if I now call for x in infinite.take(10) I get:

`Error: type mismatch: got (iterator (): int{.closure, gcsafe, locks: 0.})`

I tried to append a () to actually "call" the iterator, but it still doesn't work. So it comes down to the question: How should I construct/return an iterator from a template?

Stilly answered 12/4, 2015 at 10:21 Comment(4)

There is an open issue that might be related to this problem. – Robustious 12/4, 2015 at 19:4

Jon Skeet has a good blog post series on doing this for LINQ in C# - that's really similar to what you're doing (albeit in another language) but with the same overall idea. – Tonita 12/4, 2015 at 20:34

Why couldn't they call this language something else dammit, it's well confusing seeing questions with my name appear on them.. :/ – Schizogenesis 15/4, 2015 at 11:30

@Nim: At least it's not something like John. Or Julia :). – Stilly 15/4, 2015 at 11:41

The problem lies in

for x in infinite.take(10):
  echo x

Or, more specifically, the call infinite.take(10), which we can also write as take(infinite, 10). Unlike Sather, Nim doesn't have once arguments for its iterators, so there isn't a way to distinguish between arguments that should be evaluated once per loop and arguments that should be evaluated once per loop iteration.

In the case of passing a closure iterator as an argument to another closure iterator, that means that a new instance of the infinite iterator with a new environment is created each time you go through the loop. This will make infinite start at zero again and again.
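
To see the "new instance, new environment" effect in isolation, here is a minimal sketch (counter is a made-up name for illustration, it is not part of the question's code). Every time the closure iterator symbol is bound to a variable, you get an independent instance with its own state:

```nim
# Illustrative only: `counter` is a hypothetical iterator, not from the question.
iterator counter(): int {.closure.} =
  var i = 0
  while true:
    yield i
    inc i

let a = counter   # first instance, with its own environment
echo a()          # 0
echo a()          # 1 (same instance, so the state advances)

let b = counter   # second instance, fresh environment
echo b()          # 0 again
```

Writing infinite.take(10) directly in the for loop has the same effect as re-creating b on every step, which is why only zeros show up.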

Inline iterators will normally evaluate their arguments only once per loop (and this is the expected behavior in most cases). Closure iterators have to undergo a transformation of their body into a state machine, which changes how they are being called. They also can be used differently: in particular, closure iterators can have multiple call sites, unlike inline iterators; e.g. let iter = ...; iter(someArgument); iter(someOtherArgument). As a result, I am not sure if we are looking at a bug or at intended behavior here.

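You can fix this by not passing infinite to take directly, but binding it with let first. There's also a bug in your take code: the loop never terminates, because it keeps pulling elements from the infinite iterator even after numToTake elements have been yielded. You need to fix that as well, by breaking out of the loop. The resulting code would be something like: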

iterator infinite(): int {.closure.} =
  var i = 0
  while true:
    yield i
    inc i

iterator take[T](it: iterator (): T, numToTake: int): T {.closure.} =
  var i = 0
  for x in it():
    if i >= numToTake:
      break
    yield x
    inc i

let inf = infinite
for x in inf.take(10):
  echo x

If you wish to parameterize infinite, this can be done by wrapping the iterator in a template or proc, e.g.:

template infiniteFrom(x: int): (iterator (): int) =
  (iterator (): int =
    var i = x
    while true:
      yield i
      inc i)

...

let inf = infiniteFrom(1)
for x in inf.take(10):
  echo x

Rejection answered 15/4, 2015 at 11:9 Comment(3)

Thanks a lot, very good explanation! I also like your proposal over at Github. In all these examples "once" arguments would probably be really helpful. – Stilly 15/4, 2015 at 11:29

A few Nim versions later... What if infinite also took parameters, in the sense of countUpFrom(1) (which is not callable, so a let binding does not work)? I still keep getting `attempting to call undeclared routine` with my iterator experiments. Same issue for inf.take(10).take(9). – Stilly 31/1, 2017 at 22:53

A response would be too long for a comment, so I updated my answer instead. – Rejection 1/2, 2017 at 7:10

I've also tried to add functional methods to Nim, and I've ended up wrapping everything in functions. Please take a look at http://forum.nim-lang.org/t/1230. This way you can assign the iterator to a variable before looping over it with for.
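
As a rough sketch of what "wrapping everything in functions" can look like (this is not the code from the linked thread; countFrom, take, map, filter, isOdd and square are hypothetical names chosen for illustration), each proc below takes a closure iterator and returns a new one, so the result can be bound with let before the loop:

```nim
# Sketch only: all helper names here are assumptions for illustration;
# none of them come from the standard library or the linked forum thread.

proc countFrom(start: int): iterator (): int =
  # Each call builds a fresh closure iterator with its own counter.
  result = iterator (): int =
    var i = start
    while true:
      yield i
      inc i

proc take[T](it: iterator (): T, n: int): iterator (): T =
  # Yields at most n elements, then stops pulling from the source.
  result = iterator (): T =
    var taken = 0
    for x in it():
      if taken >= n:
        break
      yield x
      inc taken

proc map[T, U](it: iterator (): T, f: proc (x: T): U): iterator (): U =
  result = iterator (): U =
    for x in it():
      yield f(x)

proc filter[T](it: iterator (): T, pred: proc (x: T): bool): iterator (): T =
  result = iterator (): T =
    for x in it():
      if pred(x):
        yield x

proc isOdd(x: int): bool = x mod 2 == 1
proc square(x: int): int = x * x

# Bind the whole chain once, then iterate over the bound instance.
let pipeline = countFrom(1).filter(isOdd).map(square).take(3)
for x in pipeline():
  echo x
```

With these definitions the loop should print 1, 9 and 25. Note that take, like the version in the other answer, pulls one extra element from the source before it breaks, and that the wrapped source has to be a closure iterator, so inline iterators still need to be wrapped first.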

Phore answered 18/5, 2015 at 7:21 Comment(0)
