How to stop iteration in an Enumerator::Lazy method?

I am trying to implement a take_until method for Ruby 2's Enumerator::Lazy class. It should work similarly to take_while, but stop iterating when the block returns true. The result should include the item for which the block matched.

My question is: how do I signal that the end of the iteration has been reached? With regular Enumerators you can raise StopIteration in an each method to signal the end of the iterator, but that doesn't seem to work for lazy enumerators:

class Enumerator::Lazy  
  def take_until
    Lazy.new(self) do |yielder, *values|
      yielder << values
      raise StopIteration if yield *values
    end
  end
end

(1..Float::INFINITY).lazy.take_until{ |i| i == 5 }.force

I also tried to break out of the block to no effect. The documentation for Enumerator::Lazy doesn't seem to help either.

Why take_while is not a valid option

The main problem with take_while is that by its nature it will attempt to evaluate one more item than you need. In my application the Enumerator doesn't yield numbers, but messages fetched over the network. Trying to evaluate a message that is not there (yet?) is a blocking action which is highly undesirable. This is illustrated by the following contrived example:

enum = Enumerator.new do |y|
  5.times do |i|
    y << i
  end
  sleep
end

enum.lazy.take_while{ |i| i < 5 }.force

To receive the first five items from this enumerator you will need to evaluate the sixth result. This is not as lazy as it could be. In my use case this is undesirable since the process would block.
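
The extra evaluation can be made visible without blocking. The following is only a sketch: instead of sleeping, the source records every item it is asked to produce (the names pulled and source are made up for this illustration), so the two built-in methods can be compared side by side.

```ruby
pulled = [] # every item the source was asked to produce

source = Enumerator.new do |y|
  10.times do |i|
    pulled << i
    y << i
  end
end

while_result = source.lazy.take_while { |i| i < 5 }.force
pulled_by_take_while = pulled.dup

pulled.clear
take_result = source.lazy.take(5).force
pulled_by_take = pulled.dup

p while_result         # [0, 1, 2, 3, 4]
p pulled_by_take_while # [0, 1, 2, 3, 4, 5] (one item more than returned)
p take_result          # [0, 1, 2, 3, 4]
p pulled_by_take       # [0, 1, 2, 3, 4]
```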

Providing a pure Ruby implementation of take for Enumerator::Lazy

The standard library includes a take method that does something similar to what I want. It takes a number rather than a block as its condition, but it does break out of the iteration once that number is reached instead of evaluating one more item. Following on from the example above:

enum.lazy.take(5).force

This does not get to the sixth item and so does not block. The problem is that the version in the standard library is implemented in C, and I can't figure out how it could be implemented in pure Ruby. A Ruby implementation of that method would be an acceptable answer.

Thanks in advance!

Okhotsk answered 23/12, 2013 at 22:5 Comment(8)
I hate to ask, but couldn't you just use take_while and amend the condition as needed? - Scevour
That's a valid question, but I think the answer is: no, I can't. My use case involves a stream of responses where I specifically want to delimit the sequence once I encounter a certain condition. take_while wouldn't include the matched item itself but instead returns the sequence up to the first 'miss'. Hope that makes sense. - Okhotsk
So take_until would be a negated take_while which does one additional yield next? - Joyajoyan
@Joyajoyan it looks like it. - Autoharp
@Joyajoyan That's correct. In the above example it should return [1, 2, 3, 4, 5]. - Okhotsk
Another note: if someone could provide a Ruby implementation of take or take_while, I think I would be able to derive my answer from that. - Okhotsk
Ruby has take_while defined on Enumerable. Enumerator is certainly Enumerable. - Joyajoyan
@Joyajoyan take_while on Enumerable is the non-lazy version. There is a lazy version defined on Enumerator::Lazy. Both are implemented in C. I would need a lazy version implemented in Ruby to derive my answer. - Okhotsk
S
4

It's an old question, but anyway: as you say, what you really need is a Lazy#take_until, since Lazy#take_while has to obtain the next item to decide whether to stop. I've been unable to implement Lazy#take_until using Lazy.new { ... }; apparently there is no mechanism for breaking out of it. Here is a possible workaround:

class Enumerator::Lazy  
  def take_until
    Enumerator.new do |yielder|
      each do |value|
        yielder << value
        break if yield(value)
      end
    end.lazy
  end
end
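
A usage sketch for this workaround, under a few assumptions: the source below records what it is asked to produce (pulled and source are illustrative names) rather than blocking, and take_n is a hypothetical name for a pure-Ruby counterpart of take built on the same wrapper pattern, chosen so the built-in take is not overridden.

```ruby
class Enumerator::Lazy
  # Wrap iteration in a fresh Enumerator so that `break` can end it.
  def take_until
    Enumerator.new do |yielder|
      each do |value|
        yielder << value
        break if yield(value) # stop right after the matching item
      end
    end.lazy
  end

  # Hypothetical pure-Ruby analogue of take, using the same pattern.
  def take_n(n)
    Enumerator.new do |yielder|
      next if n <= 0 # nothing to take, so do not touch the source
      taken = 0
      each do |value|
        yielder << value
        taken += 1
        break if taken >= n
      end
    end.lazy
  end
end

pulled = [] # every item the source was asked to produce
source = Enumerator.new do |y|
  10.times do |i|
    pulled << i
    y << i
  end
end

lazy_result = source.lazy.take_until { |i| i == 4 }
pulled_before_force = pulled.dup # [] (nothing evaluated yet)
until_result = lazy_result.force
pulled_by_take_until = pulled.dup

pulled.clear
take_n_result = source.lazy.take_n(3).force
pulled_by_take_n = pulled.dup

p until_result         # [0, 1, 2, 3, 4]
p pulled_by_take_until # [0, 1, 2, 3, 4]
p take_n_result        # [0, 1, 2]
p pulled_by_take_n     # [0, 1, 2]
```

In both cases the source is never asked for an item beyond the last one returned, which is the property the question is after.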
Selinaselinda answered 1/12, 2015 at 11:8 Comment(0)
S
0

Per my comment, methinks amending take_while is the better option (or at least a valid one):

(1..Float::INFINITY).lazy.take_while { |i| i < 6 }.force
=> [1, 2, 3, 4, 5]

For more complex conditions that are less easy to rewrite, add a variable:

found = false
(1..Float::INFINITY).lazy.take_while do |i|
  if i == 5
    found = true
  else
    !found
  end
end.force
=> [1, 2, 3, 4, 5]

You could also define take_until based on that last block:

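class Enumerator::Lazy
  def take_until
    # Caveat: @found is stored on the receiver, so it persists across
    # calls made on the same lazy enumerator.
    take_while do |*args|
      if !@found
        @found = yield(*args)
        true
      else
        false
      end
    end
  end
end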

Note also that it won't call the block needlessly:

p (1..20).lazy.take_until{|i| p i; i == 5}.force
p (1..20).lazy.take_until{|i| p i; i == 3}.force
p (1..20).lazy.take_until{|i| p i; i == 8}.force
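
To make precise what is and isn't evaluated here, this sketch counts both the calls to the user's block and the items requested from the source. It uses a local flag instead of @found and a hypothetical method name (take_until_via_take_while), purely so the example stands alone; checked, pulled, and source are likewise illustrative names.

```ruby
class Enumerator::Lazy
  # Hypothetical name; same idea as above, with a local flag.
  def take_until_via_take_while
    found = false
    take_while do |*args|
      next false if found # the user's block is not called again
      found = yield(*args)
      true                # keep the current item, including the match
    end
  end
end

pulled = []  # items requested from the source
checked = [] # items the user's block was called with

source = Enumerator.new do |y|
  10.times do |i|
    pulled << i
    y << i
  end
end

result = source.lazy.take_until_via_take_while { |i| checked << i; i == 4 }.force

p result  # [0, 1, 2, 3, 4]
p checked # [0, 1, 2, 3, 4]
p pulled  # [0, 1, 2, 3, 4, 5]
```

The block is indeed not called for the extra item, but take_while still has to fetch that item from the source before it can stop.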
Scevour answered 24/12, 2013 at 0:17 Comment(7)
The main problem with take_while is that by its nature it will attempt to evaluate one more item than you need. In my application the Enumerator doesn't yield numbers, but messages fetched over the network. Trying to evaluate a message that is not there (yet?) is a blocking action which is highly undesirable. That is why take_while is not a valid option in my scenario. - Okhotsk
Well, hence the suggestion to rewrite the condition accordingly. :-| (But see my edited answer in a moment.) - Scevour
There… take_until implementation using take_while posted, per my initial answer. Or am I misunderstanding your edit? - Scevour
It will still yield one more item than necessary. I don't think there is any way around that when using take_while. For example, evaluating enum.lazy.take_until{ |i| i == 4 }.force using the enum from my edit above will block. Thanks for all the effort you are putting into this though. - Okhotsk
Err… hold on. I'd expect enum.lazy.take_until{ |i| i == 4 }.force to yield 4 and then immediately stop, personally. You'd want it to stop at 3 without evaluating the block with i such that i == 4? If so, how would that be possible? :-) - Scevour
In the sample code I posted, note the examples: it evaluates the block for i until it reaches one that returns true, and then stops. Technically, the block that gets passed to take_while evaluates for the next iteration, but the condition on @found is there so that the block that you pass to take_until won't get called in that particular case. - Scevour
Your code doesn't immediately stop at 4. It will still go into the next iteration, although it won't actually yield within that iteration. Try it. enum.lazy.take_until{ |i| i == 4 }.force will block instead of returning [0, 1, 2, 3, 4] as it is supposed to. - Okhotsk
O
0

I just found this implementation. It's not optimal because it forces the iteration prematurely: the items are collected into an array as soon as take_until is called, not when the result is consumed.

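class Enumerator::Lazy
  def take_until
    return self unless block_given?

    ary = []
    # Kernel#loop rescues StopIteration, so a finite source without a match
    # ends cleanly; `while n = self.next` would also stop on nil or false items.
    loop do
      n = self.next
      ary << n
      break if yield(n)
    end
    ary.lazy
  end
end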

Using the example from my question:

enum = Enumerator.new do |y|
  5.times do |i|
    y << i
  end
  sleep
end

p enum.lazy.take_until{ |i| i == 4 }.force

This now returns [0, 1, 2, 3, 4].
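
The premature forcing can be observed directly. This is a sketch under assumptions: eager_take_until is a hypothetical name for an equivalent of the method above (so nothing else is overridden), and the source records what it is asked to produce in pulled instead of sleeping.

```ruby
class Enumerator::Lazy
  # Hypothetical name; collects items eagerly via external iteration.
  def eager_take_until
    ary = []
    loop do
      n = self.next
      ary << n
      break if yield(n)
    end
    ary.lazy
  end
end

pulled = [] # every item the source was asked to produce
source = Enumerator.new do |y|
  10.times do |i|
    pulled << i
    y << i
  end
end

lazy_result = source.lazy.eager_take_until { |i| i == 4 }
pulled_before_force = pulled.dup
result = lazy_result.force

p pulled_before_force # [0, 1, 2, 3, 4] (already evaluated before force)
p result              # [0, 1, 2, 3, 4]
```

So the extra item is never requested, but the work happens at call time rather than on demand.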

I'm keeping this question open a bit longer to see if someone comes up with a truly lazy implementation, but I doubt we will find one.

Okhotsk answered 24/12, 2013 at 16:2 Comment(0)
