Ruby: Manipulate Iterators?
Asked Answered
S

3

6

I'm having teething problems with Ruby, with regards to creating single-direction, lazily-evaluated, potentially-infinite iterators. Basically, I'm trying to use Ruby like I'd use Haskell lists and, to a lesser extent, Python generators.

It isn't that I don't understand them per se; I just don't know how to use them as casually as other languages, and I'm also unsure about what methods in Ruby will turn them into arrays behind my back, unloading the entire sequence into memory unnecessarily.

And yes, I've been studying the Ruby reference manual. For half an hour in fact, attentively. Or perhaps evidently not.

For example, if I were to implement a card deck, it would look something like this in Python (untested):

# Python 3

from itertools import chain, count

face_ranks =
    dict(
        zip(
            ('jack', 'queen', 'king', 'ace'),
            count(11)))

sorted_deck =
    map(
        lambda suit:
            map(
                lambda rank:
                    {
                        'rank' : rank,
                        'suit' : suit
                    },
                chain(
                    range(2, 11),
                    face_ranks.keys())),
        ('clubs', 'diamonds', 'hearts', 'spades'))

So, how would I do this in Ruby, completely avoiding arrays? Note that the above code uses, to my knowledge, only tuples and generators: at no point is the whole sequence dumped into memory like if I had used an array. I could be wrong about the above code, but you get what I want.

How do I chain iterators (like Python's chain())? How do I generate an iterator of an infinite range (like Python's count())? How can I add an array to an iterator (like passing a tuple to Python's chain()) without converting the whole thing to an array in the process?

I've seen solutions, but they involve arrays or unnecessary complexities like fibers.

In Python I can manipulate and throw around iterators with the same simplicity as arrays. I can almost treat them like Haskell lists, which I'm most comfortable with, and is really what I think in when coding. I'm not comfortable with Ruby arrays, which is why I seek help with its alternative(s).

I've managed to pick up nuggets of information on the internet about it, but I couldn't find any that covers basic manipulation of such data structures in Ruby? Any help?

Sinless answered 8/8, 2011 at 1:50 Comment(0)
B
4

Ruby doesn't seem to have a lot of built-in methods for doing the different things you wanted to do with enumerators, but you can make your own methods. That's what I did here, using Ruby 1.9:

iter.rb

def get_enums_from_args(args)
  args.collect { |e| e.is_a?(Enumerator) ? e.dup : e.to_enum }
end

def build(y, &block)
  while true
    y << (begin yield; rescue StopIteration; break; end)
  end
end

def zip(*args)
  enums = get_enums_from_args args
  Enumerator.new do |y|
    build y do
      enums.collect { |e| e.next }
    end
  end
end

def chain(*args)
  enums = get_enums_from_args args
  Enumerator.new do |y|
    enums.each do |e|
      build y do
        e.next
      end
    end
  end
end

def multiply(*args)
  enums = get_enums_from_args args
  duped_enums = enums.collect { |e| e.dup }
  Enumerator.new do |y|
    begin
      while true
        y << (begin; enums.collect { |e| e.peek }; rescue StopIteration; break; end )

        index = enums.length - 1
        while true
          begin
            enums[index].next
            enums[index].peek
            break
          rescue StopIteration
            # Some iterator ran out of items.

            # If it was the first iterator, we are done,
            raise if index == 0

            # If it was a different iterator, reset it
            # and then look at the iterator before it.
            enums[index] = duped_enums[index].dup
            index -= 1
          end
        end
      end
    rescue StopIteration
    end
  end
end

And I wrote a spec using rspec to test the functions and demonstrate what they do:

iter_spec.rb:

require_relative 'iter'

describe "zip" do
  it "zips together enumerators" do
    e1 = "Louis".chars
    e2 = "198".chars
    zip(e1,e2).to_a.should == [ ['L','1'], ['o','9'], ['u','8'] ]
  end

  it "works with arrays too" do
    zip([1,2], [:a, nil]).to_a.should == [ [1,:a], [2,nil] ]
  end
end

describe "chain" do
  it "chains enumerators" do
    e1 = "Jon".chars
    e2 = 0..99999999999
    e = chain(e1, e2)
    e.next.should == "J"
    e.next.should == "o"
    e.next.should == "n"
    e.next.should == 0
    e.next.should == 1
  end
end

describe "multiply" do
  it "multiplies enumerators" do
    e1 = "ABC".chars
    e2 = 1..3
    multiply(e1, e2).to_a.should == [["A", 1], ["A", 2], ["A", 3], ["B", 1], ["B", 2], ["B", 3], ["C", 1], ["C", 2], ["C", 3]]
  end

  it "is lazily evalutated" do
    e1 = 0..999999999
    e2 = 1..3
    e = multiply(e1, e2)
    e.next.should == [0, 1]
    e.next.should == [0, 2]
    e.next.should == [0, 3]
    e.next.should == [1, 1]
    e.next.should == [1, 2]
  end

  it "resulting enumerator can not be cloned effectively" do
    ranks = chain(2..10, [:jack, :queen, :king, :ace])
    suits = [:clubs, :diamonds, :hearts, :spades]
    cards = multiply(suits, ranks)
    c2 = cards.clone
    cards.next.should == [:clubs, 2]
    c2.next.should == [:clubs, 2]
    c2.next.should == [:clubs, 3]
    c2.next.should == [:clubs, 4]
    c2.next.should == [:clubs, 5]
    cards.next.should == [:clubs, 6]
  end

  it "resulting enumerator can not be duplicated after first item is evaluated" do
    ranks = chain(2..10, [:jack, :queen, :king, :ace])
    suits = [:clubs, :diamonds, :hearts, :spades]
    cards = multiply(ranks, suits)
    cards.peek
    lambda { cards.dup }.should raise_error TypeError
  end
end

As shown in the specs above, these methods use lazy evaluation.

Also, the major weakness of the zip, chain, and multiply functions defined here is that the resulting enumerator can not be easily duplicated or cloned because we haven't written any code to duplicate the enum arguments that these new enumerators rely on. You would probably need to make a subclass of Enumerator or make a class the includes the Enumerable module or something like that to make dup work nicely.

Bilocular answered 8/8, 2011 at 4:18 Comment(0)
T
2

It seems like you're avoiding Ruby arrays out of performance anxiety, perhaps due to experience you've had with arrays in other languages. You needn't avoid Ruby arrays — they are the closest thing you will get to a tuple in Ruby.

foo = 1, 2, 3, 4
foo.class       #=> Array

It looks like you're looking for a Range in place of a generator:

range = 1..4
range.class     #=> Range
range.count     #=> 4

('a'..'z').each { |letter| letter.do_something }

A range isn't converted to an array, but it does include Enumerable so you can use all of your regular enumerators. As far as looping/iteration — the native looping in Ruby is through Enumerable. for i in group is actually syntactic sugar for enumerator loops (like .each). Enumerable methods usually return the sender, so you can chain them:

(1..10).map { |n| n * 2 }.each { |n| print "##{n}" }
# outputs #2#4#6#8#10#12#14#16#18#20
# returns an array:
#=> [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]

I'd love to give you more specific answers about your Python»Ruby equivalents, but I'm unfamiliar with Python.

Update

You can zip ranges together into a nested array like this:

(1..26).zip('a'..'z') #=> [[1, 'a'], [2, 'b'], ...]

…but Ranges aren't mutable. You can convert a range to an array with (1..5).to_a, or you can iterate over it like I showed above. In the case that you have multiple defined ranges of data that you want to test for inclusion you can use a couple ranges and a map:

allowed = 'a'..'z', 1..100
input = # whatever
allowed.each do |range|
  return false unless range.cover? input
end

Of course you can always use enumerators with ranges to "generate" values on the fly.

Threap answered 8/8, 2011 at 2:15 Comment(4)
Thanks for the help. I was looking for operations like zipping and appending; can you do such things on ranges? If not, what is the Ruby equivilent for performing such operations on Range-like data?Sinless
Hmm... that zipping example returns an array, which means it couldn't have been lazily evaluated, right? That means the result would have been computed in its entiety regardless of whether I wanted to evaluate all the resulting elements or not. This is why I'm not a fan of arrays, but I wonder if there was a version of that zip method that returned a lazy enumerator rather than an eagerly-evaluated array? Thanks for the help so far.Sinless
I'm not sure on the implementation details of zip. You can't really mutate data without it being concrete or you're really just defining a contract for data to be mutated on request. Ruby might have something in the Standard Library for this already, but I'm not familiar with data processing in Ruby. You can also include Enumerable into your own class and use more involved looping to achieve what you want more effectively on top of Ruby (versus with a native structure). You might also get a lot out of looking at the ActiveRecord library that does a LOT of lazy querying.Threap
I've been looking at Ruby's library some more, and it seems that whilst Python's sequence manipulation functions are genralized to any iterable, Ruby has hard-coded quite a lot them to specific sequence types. That was actually really suprising...Sinless
H
2

The closest equivalent in Ruby is an Enumerator. It lets you do lazy generators.

Hardshell answered 8/8, 2011 at 2:25 Comment(1)
But TBH I wouldn't want to use a full blown generator-equivilent for something as trivial as concatenating or zipping multiple ranges.Sinless

© 2022 - 2024 — McMap. All rights reserved.