Why does Iterable<T> not provide stream() and parallelStream() methods?

3

272

I am wondering why the Iterable interface does not provide the stream() and parallelStream() methods. Consider the following class:

public class Hand implements Iterable<Card> {
    private final List<Card> list = new ArrayList<>();
    private final int capacity;

    //...

    @Override
    public Iterator<Card> iterator() {
        return list.iterator();
    }
}

It is an implementation of a Hand as you can have cards in your hand while playing a Trading Card Game.

Essentially it wraps a List<Card>, ensures a maximum capacity and offers some other useful features. This is better than implementing it directly as a List<Card>.

Now, for convenience I thought it would be nice to implement Iterable<Card>, so that you can use enhanced for-loops to loop over it. (My Hand class also provides a get(int index) method, hence the Iterable<Card> is justified in my opinion.)

The Iterable interface provides the following (left out javadoc):

public interface Iterable<T> {
    Iterator<T> iterator();

    default void forEach(Consumer<? super T> action) {
        Objects.requireNonNull(action);
        for (T t : this) {
            action.accept(t);
        }
    }

    default Spliterator<T> spliterator() {
        return Spliterators.spliteratorUnknownSize(iterator(), 0);
    }
}

Now you can obtain a stream with:

Stream<Card> stream = StreamSupport.stream(hand.spliterator(), false);

So onto the real question:

  • Why does Iterable<T> not provide default methods that implement stream() and parallelStream()? I see nothing that would make this impossible or unwanted.

A related question I found is the following, though: Why does Stream<T> not implement Iterable<T>?
Oddly enough, it suggests doing things somewhat the other way around.

Make answered 16/4, 2014 at 15:38 Comment(3)
I guess this is a good question for the Lambda Mailing List.Context
Why is it odd to want to iterate over a stream? How else could you possibly break; an iteration? (Ok, Stream.findFirst() might be a solution, but that might not fulfill all needs...)Dunsinane
See also Convert Iterable to Stream using Java 8 JDK for practical workarounds.Craftsman
S
328

This was not an omission; there was detailed discussion on the EG list in June of 2013.

The definitive discussion of the Expert Group is rooted at this thread.

While it seemed "obvious" (even to the Expert Group, initially) that stream() made sense on Iterable, the fact that Iterable was so general became a problem, because the obvious signature:

Stream<T> stream()

was not always what you were going to want. Some things that were Iterable<Integer> would rather have their stream method return an IntStream, for example. But putting the stream() method this high up in the hierarchy would make that impossible. So instead, we made it really easy to make a Stream from an Iterable, by providing a spliterator() method. The implementation of stream() in Collection is just:

default Stream<E> stream() {
    return StreamSupport.stream(spliterator(), false);
}

Any client can get the stream they want from an Iterable with:

Stream s = StreamSupport.stream(iter.spliterator(), false);

In the end we concluded that adding stream() to Iterable would be a mistake.
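As a rough illustration of the two points above, here is a minimal sketch. The class and helper names (IterableStreams, streamOf, intStreamOf) are made up for this example and are not part of the JDK; only StreamSupport.stream, Iterable.spliterator and Stream.mapToInt are real API. It shows a client building a Stream<T> from a plain Iterable, and how a type holding Integers could choose to hand out an IntStream instead.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.IntStream;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

class IterableStreams {
    // Hypothetical helper: a sequential Stream over any Iterable.
    static <T> Stream<T> streamOf(Iterable<T> iterable) {
        return StreamSupport.stream(iterable.spliterator(), false);
    }

    // Hypothetical helper: what an Iterable<Integer> might prefer to expose.
    static IntStream intStreamOf(Iterable<Integer> iterable) {
        return streamOf(iterable).mapToInt(Integer::intValue);
    }

    public static void main(String[] args) {
        List<Integer> backing = Arrays.asList(1, 2, 3);
        // A plain Iterable that is deliberately not a Collection.
        Iterable<Integer> numbers = backing::iterator;

        System.out.println(streamOf(numbers).count());   // prints 3
        System.out.println(intStreamOf(numbers).sum());  // prints 6
    }
}
```

Had Iterable declared Stream<T> stream() itself, a class like this could not have redefined stream() to return IntStream, because IntStream is not a subtype of Stream<Integer>.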

Schenk answered 20/4, 2014 at 2:51 Comment(13)
I see, first of all, thanks a lot for answering the question. I am still curious though why an Iterable<Integer> (I think you are talking about?) would want to return an IntStream. Would the iterable then not rather be a PrimitiveIterator.OfInt? Or do you perhaps mean another usecase?Make
Maybe, but you don't really know that either. Iterable is just too general; any default methods we add to Iterable implicitly constrains the contract of Iterable (worse, after the fact). We made it easy to get a stream from an Iterable, by exposing spliterator() (which itself is a generalization of Iterator, so this is not much of a stretch.) But prudence dictated that we stop there.Schenk
Just noting that the last code sample does not compile AFAICT. The supplier variation takes three arguments, not one.Kymberlykymograph
I find it odd that the above logic was supposedly applied to Iterable (I can't have stream() because someone might want it to return IntStream) whereas an equal amount of thought wasn't given to adding the exact same method to Collection (I might want my Collection<Integer>'s stream() to return IntStream also.) Whether it were present on both or absent on both, people would probably have just gotten on with their lives, but because it's present on one and absent on the other, it becomes quite a glaring omission...Hiltner
Composition over inheritance.Sunbathe
Why couldn't Iterable have stream(), then implementors can add intStream() if applicable? Also, I suspect the cases where the issue you describe would occur are rare.Great
Brian McCutchon: That makes more sense to me. It sounds like people just got tired of arguing and decided to play it safe.Satisfaction
While this makes sense, is there a reason why there isn't an alternative static Stream.of(Iterable), which would at least make the method reasonably discoverable by reading the API documentation -- as somebody who has never really worked with the internals of Streams I'd never even looked at StreamSupport, which is described in the documentation as providing "low-level operations" that are "mostly for library writers".Mutation
I totally agree with Jules. A static method Stream.of(Iterable iter) or Stream.of(Iterator iter) should be added, instead of StreamSupport.stream(iter.spliterator(), false);Periosteum
Agree with Jules, Brian M., skiwi, and user_3380739 -- as someone coming back to Java after many, many years of C#, this seems like an obvious omission. Also, spliterator -- seriously, that's what it's called? Sounds like a 3rd party JS library to me!Yellowstone
Why does java.util.stream and java.util.function have special treatment for primitives? We have IntStream and IntUnaryOperator etc. but not IntList or IntIterable etc. Then there wouldn't be the problem of Iterable<Integer> wanting IntStream return type.Cassy
@wilmol I'm also wondering why the Java Collections Framework has no specializations for primitives, as Stream and Function do.Lasandralasater
"This was not an omission" Perhaps super pedantic, but it's definitely an omission. It's just an intentional omission, not an accidental one.Rubberneck
C
26

I investigated several of the Project Lambda mailing lists and I think I found a few interesting discussions.

I have not found a satisfactory explanation so far. After reading all this I concluded it was just an omission. But you can see here that it was discussed several times over the years during the design of the API.

Lambda Libs Spec Experts

I found a discussion about this in the Lambda Libs Spec Experts mailing list:

Under Iterable/Iterator.stream() Sam Pullara said:

I was working with Brian on seeing how limit/substream functionality[1] might be implemented and he suggested conversion to Iterator was the right way to go about it. I had thought about that solution but didn't find any obvious way to take an iterator and turn it into a stream. It turns out it is in there, you just need to first convert the iterator to a spliterator and then convert the spliterator to a stream. So this brings me to revisit the whether we should have these hanging off one of Iterable/Iterator directly or both.

My suggestion is to at least have it on Iterator so you can move cleanly between the two worlds and it would also be easily discoverable rather than having to do:

Streams.stream(Spliterators.spliteratorUnknownSize(iterator, Spliterator.ORDERED))
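The Streams.stream(...) call in that message appears to come from a pre-release build. As far as I can tell, the same two-step conversion in the released Java 8 API goes through StreamSupport.stream. A minimal sketch, where the class and helper names (IteratorStreams, streamOf) are invented for illustration:

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

class IteratorStreams {
    // Hypothetical helper: Iterator -> Spliterator -> Stream.
    static <T> Stream<T> streamOf(Iterator<T> iterator) {
        // The size is unknown, so only ORDERED is declared.
        Spliterator<T> spliterator =
                Spliterators.spliteratorUnknownSize(iterator, Spliterator.ORDERED);
        return StreamSupport.stream(spliterator, false);
    }

    public static void main(String[] args) {
        Iterator<String> it = Arrays.asList("a", "b", "c").iterator();
        List<String> upper = streamOf(it)
                .map(String::toUpperCase)
                .collect(Collectors.toList());
        System.out.println(upper); // prints [A, B, C]
    }
}
```

Note that an Iterator can only be consumed once, so a stream built this way cannot be recreated from the same Iterator.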

And then Brian Goetz responded:

I think Sam's point was that there are plenty of library classes that give you an Iterator but don't let you necessarily write your own spliterator. So all you can do is call stream(spliteratorUnknownSize(iterator)). Sam is suggesting that we define Iterator.stream() to do that for you.

I would like to keep the stream() and spliterator() methods as being for library writers / advanced users.

And later

"Given that writing a Spliterator is easier than writing an Iterator, I would prefer to just write a Spliterator instead of an Iterator (Iterator is so 90s :)"

You're missing the point, though. There are zillions of classes out there that already hand you an Iterator. And many of them are not spliterator-ready.

Previous Discussions in Lambda Mailing List

This may not be the answer you are looking for but in the Project Lambda mailing list this was briefly discussed. Perhaps this helps to foster a broader discussion on the subject.

In the words of Brian Goetz under Streams from Iterable:

Stepping back...

There are lots of ways to create a Stream. The more information you have about how to describe the elements, the more functionality and performance the streams library can give you. In order of least to most information, they are:

Iterator

Iterator + size

Spliterator

Spliterator that knows its size

Spliterator that knows its size, and further knows that all sub-splits know their size.

(Some may be surprised to find that we can extract parallelism even from a dumb iterator in cases where Q (work per element) is nontrivial.)

If Iterable had a stream() method, it would just wrap an Iterator with a Spliterator, with no size information. But, most things that are Iterable do have size information. Which means we're serving up deficient streams. That's not so good.

One downside of the API practice outlined by Stephen here, of accepting Iterable instead of Collection, is that you are forcing things through a "small pipe" and therefore discarding size information when it might be useful. That's fine if all you're doing to do is forEach it, but if you want to do more, its better if you can preserve all the information you want.

The default provided by Iterable would be a crappy one indeed -- it would discard size even though the vast majority of Iterables do know that information.
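The loss of size information described here can be observed directly. The following is a small sketch (the class name SizeInfoDemo and the helper isSized are made up) that compares the spliterator of an ArrayList with the default spliterator of a plain Iterable view over the same elements:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Spliterator;

class SizeInfoDemo {
    // Hypothetical helper: does this spliterator report an exact size?
    static boolean isSized(Spliterator<?> s) {
        return s.hasCharacteristics(Spliterator.SIZED);
    }

    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3));
        // Same elements, but seen only through the Iterable interface,
        // so the default Iterable.spliterator() is used.
        Iterable<Integer> viewOnly = list::iterator;

        Spliterator<Integer> fromList = list.spliterator();
        Spliterator<Integer> fromIterable = viewOnly.spliterator();

        System.out.println(isSized(fromList));                   // prints true
        System.out.println(fromList.getExactSizeIfKnown());      // prints 3
        System.out.println(isSized(fromIterable));               // prints false
        System.out.println(fromIterable.getExactSizeIfKnown());  // prints -1
    }
}
```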

Contradiction?

However, that discussion seems to be based on the changes the Expert Group made to the initial design of Streams, which was originally based on iterators.

Even so, it is interesting to note that in an interface like Collection, the stream method is defined as:

default Stream<E> stream() {
   return StreamSupport.stream(spliterator(), false);
}

This could be exactly the same code used in the Iterable interface.

So, this is why I said this answer is probably not satisfactory, but still interesting for the discussion.

Evidence of Refactoring

Continuing with the analysis of the mailing list, it looks like the spliterator method was originally in the Collection interface, and at some point in 2013 it was moved up to Iterable.

Pull splitIterator up from Collection to Iterable.

Conclusion/Theories?

Chances are, then, that the lack of the method in Iterable is just an omission, since it looks like they should have moved the stream method as well when they moved spliterator up from Collection to Iterable.

If there are other reasons, they are not evident. Does anybody else have other theories?

Context answered 16/4, 2014 at 15:59 Comment(4)
I appreciate your response, but I disagree with the reasoning there. At the moment that you override the spliterator() of the Iterable, then all issues there are fixed, and you can trivially implement stream() and parallelStream()..Make
@Make That's why I said this is probably not the answer. I am just trying to add to the discussion, because it is difficult to know why the expert group made the decisions they did. I guess all we can do is try to do some forensics in the mailing list and see if we can come up with any reasons.Context
@Make I reviewed other mailing lists and found more evidence for the discussion and perhaps some ideas that help to theorize some diagnosis.Context
Thanks for your efforts, I should really learn how to efficiently split through those mailing lists. It would help if they could be visualized in some... modern way, like a forum or something, because reading plain text emails with quotes in them is not exactly efficient.Make
H
6

If you know the size, you could use java.util.Collection, which provides the stream() method:

public class Hand extends AbstractCollection<Card> {
   private final List<Card> list = new ArrayList<>();
   private final int capacity;

   //...

   @Override
   public Iterator<Card> iterator() {
       return list.iterator();
   }

   @Override
   public int size() {
      return list.size();
   }
}

And then:

new Hand().stream().map(...)

I faced the same problem and was surprised that my Iterable implementation could be very easily extended to an AbstractCollection implementation by simply adding the size() method (luckily I had the size of the collection :-)

You should also consider overriding Spliterator<E> spliterator().
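A minimal sketch of that suggestion, assuming the backing list's own spliterator is a good fit. The Card class here is a hypothetical stand-in, and the capacity handling from the question is left out:

```java
import java.util.AbstractCollection;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Spliterator;
import java.util.stream.Collectors;

// Hypothetical stand-in for the question's Card type.
class Card {
    final String name;
    Card(String name) { this.name = name; }
}

class Hand extends AbstractCollection<Card> {
    private final List<Card> list = new ArrayList<>();

    // AbstractCollection.add throws by default; capacity checks omitted here.
    @Override
    public boolean add(Card card) { return list.add(card); }

    @Override
    public Iterator<Card> iterator() { return list.iterator(); }

    @Override
    public int size() { return list.size(); }

    // Delegate to the backing list so its splitting behaviour is reused.
    @Override
    public Spliterator<Card> spliterator() { return list.spliterator(); }
}

class HandDemo {
    public static void main(String[] args) {
        Hand hand = new Hand();
        hand.add(new Card("Ace"));
        hand.add(new Card("King"));

        System.out.println(hand.spliterator().getExactSizeIfKnown()); // prints 2
        // stream() is inherited from Collection and uses the override above.
        System.out.println(hand.stream()
                .map(c -> c.name)
                .collect(Collectors.toList())); // prints [Ace, King]
    }
}
```

Without the override, the default Collection.spliterator() already reports the size. Delegating mainly lets parallel streams split the way the backing ArrayList does.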

Hereby answered 30/9, 2015 at 20:11 Comment(0)
