Why return an enumerator?
S

6

12

I'm curious why Ruby returns an Enumerator instead of an Array in cases where an Array seems like the obvious choice. For example:

'foo'.class
# => String

Most people think of a String as an array of chars.

'foo'.chars.class
# => Enumerator

So why does String#chars return an Enumerable instead of an Array? I'm assuming somebody put a lot of thought into this and decided that Enumerator is more appropriate but I don't understand why.

Saphena answered 9/5, 2012 at 12:4 Comment(6)
What application did you need an array of chars for? Perhaps the surprise is because you aren't solving the problem in a ruby way...Tellurian
@Tellurian - For example let's say I want to list unique chars in a string, it would be nice to do: 'foo'.chars.uniq instead of 'foo'.chars.to_a.uniq. My question is: Is there any real advantage to forcing that extra step?Saphena
@Saphena (or some other user with editing superpowers) please replace one more Enumerable in the text to Enumerator.Dolley
@Dolley - Enumerator is a class that includes Enumerable the mixin. We're talking about the same thing.Saphena
@Saphena That is OK, but in the sentence "So why does String#chars return an Enumerable ..." it is nonsense. String#chars returns an Enumerator, not Enumerable. Or you can say String#chars returns an object whose class includes Enumerable. That would be OK, but not very straightforward.Dolley
'foo'.chars.class #=> Array in Ruby 2.4.1 :). Not sure in which version they changed it, though.Coincidentally
N
2

This is completely in accordance with the spirit of 1.9: to return enumerators whenever possible. String#bytes, String#lines, String#codepoints, but also methods like Array#permutation all return an enumerator.

In Ruby 1.8 String#to_a resulted in an array of lines, but the method is gone in 1.9.

Noheminoil answered 9/5, 2012 at 15:3 Comment(4)
Ok, but then why do methods like split and scan still return Arrays?Saphena
Otherwise backward compatibility would have been broken big time.Noheminoil
This is the right answer but I think it was a bad call. String#chars should return an Array.Saphena
It looks like #chars and #lines were changed in Ruby 2.0 to return an Array instead of an Enumerator. (You can use each_char and each_line if you want an Enumerator.) See ruby-doc.org/core-2.0/String.html#method-i-chars and ruby-doc.org/core-2.0/String.html#method-i-lines.Acetanilide
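To make the version difference mentioned in the comment above concrete, here is a minimal sketch, assuming Ruby 2.0 or later (the variable name is only illustrative). On 1.9 the first line would report Enumerator instead:

```ruby
word = "foo"

# String#chars: an Array on Ruby 2.0+ (an Enumerator on 1.9).
p word.chars.class

# String#each_char without a block: an Enumerator.
p word.each_char.class

# The example from the question's comments needs no to_a on 2.0+.
unique_chars = word.chars.uniq
p unique_chars

# This spelling goes through the Enumerator explicitly.
unique_via_enum = word.each_char.to_a.uniq
p unique_via_enum
```

Code that has to behave the same regardless of what chars returns can stick to each_char and call to_a only when an Array is really needed.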
R
7

If you want an Array, call #to_a. The difference between an Enumerator and an Array is that the former is lazy and the latter eager. It's the good old memory (lazy) vs. CPU (eager) trade-off. Apparently they chose lazy, also because the enumerator reflects later changes to the string:

str = "foobar"
chrs = str.chars
chrs.to_a # => ["f", "o", "o", "b", "a", "r"]
str.sub!('r', 'z')
chrs.to_a # => ["f", "o", "o", "b", "a", "z"]
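On Ruby 2.0 and later, where chars returns an Array, the same experiment can be reproduced with each_char. This is only a sketch with illustrative variable names; dup is used so the string is mutable even when string literals are frozen:

```ruby
str = "foobar".dup

snapshot = str.chars      # Array on Ruby 2.0+: characters are copied right away
view     = str.each_char  # Enumerator: reads str only when it is iterated

str.sub!('r', 'z')        # mutate the original string in place

before = snapshot.join    # "foobar", the Array does not see the change
after  = view.to_a.join   # "foobaz", the Enumerator does
puts before
puts after
```

So the eager Array behaves like a snapshot, while the lazy Enumerator behaves like a view onto the string.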
Rocca answered 9/5, 2012 at 12:10 Comment(1)
I like this explanation but I'm not sure I can imagine a situation where I would call String#chars and not use it in such a way that it wouldn't incur the extra overhead of being an Array already.Saphena
M
5
  1. Abstraction - the fact that something may be an Array is an implementation detail you don't care about for many use cases. For those where you do, you can always call .to_a on the Enumerable to get one.

  2. Efficiency - Enumerators are lazy, in that Ruby doesn't have to build the entire list of elements all at once, but can do so one at a time as needed. So only the number you need is actually computed. Of course, this leads to more overhead per item, so it's a trade-off.

  3. Extensibility - the reason chars returns an Enumerable is because it is itself implemented as an enumerator; if you pass a block to it, that block will be executed once per character. That means there's no need for e.g. .chars.each do ... end; you can just do .chars do ... end. This makes it easy to construct operation chains on the characters of the string.
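
A small sketch of points 2 and 3, written with each_char so that it behaves the same on Ruby 1.9 and later (the lazy line assumes Ruby 2.0+; variable names are illustrative):

```ruby
text = "hello world"

# Block form: the block runs once per character, no intermediate Array.
seen = []
text.each_char { |c| seen << c if c == "o" }

# Without a block you get an Enumerator that can be chained.
indexed = text.each_char.with_index.select { |c, _i| c == "o" }

# Only the elements that are asked for get produced.
first_three = text.each_char.first(3)

# Enumerator::Lazy defers the map as well (Ruby 2.0+).
shouted = text.each_char.lazy.map(&:upcase).first(3)

p seen         # ["o", "o"]
p indexed      # [["o", 4], ["o", 7]]
p first_three  # ["h", "e", "l"]
p shouted      # ["H", "E", "L"]
```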

Mushroom answered 9/5, 2012 at 12:15 Comment(7)
1. Makes no sense. The fact that it returns an Enumerator is just as much an implementation detail as the fact that it returns an Array. Regarding 2: Enumerators are lazy, not can be. Note that despite the question title, you can see from the code sample that the return type is Enumerator, not Enumerable (which would be impossible).Soakage
@sepp2k: had already fixed the verb in 2 before you commented; now I fixed the typo, too. Thanks. My point with 1 is simply that there are lots of things that implement Enumerable; there aren't many classes that go to the trouble of duck-typing themselves to look like Arrays. In that sense, Enumerators are more general, and therefore more 'abstract'.Mushroom
Yes, there are many things that implement Enumerable. So how is one class that implements Enumerable (Enumerator) more general than using another class that implements Enumerable (Array)?Soakage
@sepp2k: as long as you're treating the result as Enumerable, either is fine. But if you assume you're getting an Array and treat it as an Array, then it's no longer very general. Since OP was querying the class of the result, I thought I'd throw that difference in.Mushroom
How is that different from assuming you're getting an Enumerator and treating it as an Enumerator?Soakage
...it's different because there are lots of different Enumerable objects you could get that still fulfill your assumption, but only one way to get an Array. This discussion should go to chat if it's going to continue, though.Mushroom
let us continue this discussion in chatSoakage
T
1

'Most people think of a String as an array of chars' ... only if you think like C or other languages. IMHO, Ruby's object orientation is much more advanced than that. Most Array operations tend to be more Enumerable like, so it probably makes more sense that way.

An array is great for random access to different indexes, but strings are rarely accessed by a particular index. (And if you are trying to access a particular index, I suspect you are probably doing school work.)

If you are trying to inspect each character, Enumerable works. With Enumerable, you have access to map, each, inject, among others. Also for substitution, there are string functions and regular expressions.
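
As a rough illustration of the methods named above (a sketch only; the sample word and variable names are made up):

```ruby
word = "banana"

# map: transform every character.
upper = word.each_char.map { |c| c.upcase }

# inject: fold the characters into a tally of occurrences.
counts = word.each_char.inject(Hash.new(0)) { |tally, c| tally[c] += 1; tally }

# Substitution needs no character collection at all.
swapped  = word.gsub("a", "o")
regexped = word.gsub(/an/, "AN")

p upper     # ["B", "A", "N", "A", "N", "A"]
p counts    # {"b"=>1, "a"=>3, "n"=>2}
p swapped   # "bonono"
p regexped  # "bANANa"
```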

Frankly, I can't think of a real world need for an array of chars.

Tellurian answered 9/5, 2012 at 13:5 Comment(2)
I don't really buy this explanation. Returning an Array is more consistent with the 'Principle of Least Astonishment' than a 'reference' to an object which may change before you get a chance to use it. But that's just my opinion.Saphena
Must be, I never found it astonishing at all. You stated the issue already. You assume that a string is an array of chars. That's true in C and some other C-like languages, but it's a false assumption in Ruby.Tellurian
I
0

Maybe a string in ruby is mutable? Then having an Array isn't really an obvious choice - the length could change, for instance. But you will still want to enumerate the characters...

Also, you don't really want to be passing around the actual storage for the characters of a string, right? I mean, I don't remember much ruby (it's been a while), but if I were designing the interface, I'd only hand out "copies" for the .chars method/attribute/whatever. Now... Do you want to allocate a new array each time? Or just return a little object that knows how to enumerate the characters in the string? Thus, keeping the implementation hidden.

So, no. Most people don't think of a string as an array of chars. Most people think of a string as a string. With a behavior defined by the library/language/runtime. With an implementation you only need to know when you want to get nasty and all private with stuff below the abstraction belt.

Ioneionesco answered 9/5, 2012 at 12:9 Comment(0)
I
0

Actually 'foo'.chars passes each character in str to the given block, or returns an enumerator if no block is given.

Check it:

irb(main):017:0> 'foo'.chars
=> #<Enumerable::Enumerator:0xc8ab35 @__args__=[], @__object__="foo", @__method__=:chars>
irb(main):018:0> 'foo'.chars.each {|p| puts p}
f
o
o
=> "foo"
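This "run the block, or return an enumerator if there is none" behaviour is a common Ruby idiom built on to_enum. A minimal sketch of how a method can support both forms, using a hypothetical Shout class that is not part of Ruby:

```ruby
# Hypothetical class for illustration only; not part of Ruby's standard library.
class Shout
  def initialize(text)
    @text = text
  end

  # Yields each character upcased, or returns an Enumerator
  # when no block is given.
  def each_letter
    return to_enum(:each_letter) unless block_given?

    @text.each_char { |c| yield c.upcase }
    self
  end
end

shout = Shout.new("foo")

shout.each_letter { |c| puts c }  # block form: prints F, O, O on separate lines

letters = shout.each_letter       # no block: an Enumerator, nothing yielded yet
p letters.to_a                    # ["F", "O", "O"]
```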
Impressible answered 9/5, 2012 at 12:13 Comment(1)
Hmm, I believe that's Enumerable#each that you're considering.Saphena
