Why is the Java Stream.forEach method faster than other loops in certain situations?
Asked Answered
P

1

6

I'm currently taking on a project where I'm measuring the speed of different types of loops in Java using the Java Microbenchmark Harness (JMH) framework. I got some interesting results regarding streams which I can't explain and was wondering if someone who understands streams and Array Lists better could maybe help me explain my results.

Basically, when iterating through Array Lists around sizes of 100, the stream.forEach method is much much faster than any other type of loop:

A graph of my results is shown here: https://i.sstatic.net/W34eA.png

I've tried using Array Lists of both objects and strings and all produce similar results. As the size of the list gets bigger, the stream method is still faster, but the performance gap between other lists gets smaller. I have no idea what is causing these results.

@Fork(5)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Warmup(iterations = 10, time = 100, timeUnit = TimeUnit.MILLISECONDS)
@Measurement(iterations = 10, time = 100, timeUnit = TimeUnit.MILLISECONDS)
@State(Scope.Thread)
public class StackOverflowQ {

    List<Integer> intList;

    int size = 100;

    @Setup
    public void setup() {
        intList = new ArrayList<>(size);
        for(int i = 0; i<size;i++){
            intList.add(i);
        }
    }

    /**
     Work done to each item in the loop.
     */
    public double doWork(int item) {
        return item + 47;
    }

    @Benchmark
    public void standardFor(Blackhole bh){
        for (int i = 0; i<intList.size(); i++){
            bh.consume(doWork(intList.get(i)));
        }
    }

    @Benchmark
    public void streamForEach(Blackhole bh){
        intList.stream().forEach(i -> bh.consume(doWork(i)));
    }

}
Principe answered 9/8, 2019 at 5:32 Comment(5)
have you tried to extract the size() call from the loop (use size instead of intList.size()) - check https://mcmap.net/q/1772596/-for-and-foreach-loops-how-to-create-them-most-efficiently/85421Gamboge
Check out this question which talks about advantages and disadvantages of streams, which may help -stackoverflow.com/questions/44180101Messeigneurs
Can you confirm your stream is sequential()? For the sake of correctness, I would include this explicit call in your benchmark.Puentes
You should also compare with the for(Integer i: intList) bh.consume(doWork(i)); variant. Further, you can use forEach on the ArrayList directly, without a Stream.Previse
How can two methods give four results?Previse
Y
10

This answer talks about java.util.ArrayList. Other Collection implementations might (and do!) show completely different results.

In short: because streams use Spliterator instead of Iterator

Explanation:

Iterator based iteration is based on hasNext and next method calls, the second of which mutates the Iterator instance. This brings some performance costs with it. Spliterator supports bulk iterating. Read more about this here. The enhanced for loops (for (T e : iterable) { ... }) seem to use an Iterator.

Performance should usually be a secondary concern; you should use the constructs that best describe your intentions. While due to backward compatibility reasons it's going to be hard, maybe the performance difference between spliterator based forEach and enhanced for loops (on ArrayList) will disappear in the future.

What about indexed for loops? (List#get(int))
They show worse-than-spliterator performance partially because they need to verify the index. The other reasons probably include method calls eg. to get the data at an index, whereas Spliterator accesses the array directly. But this is pure speculation.

Tiny JMH Benchmark

Below you can see a tiny benchmark that confirms the stated theories. Please note that optimally the benchmark should have been ran for longer.

@State(Scope.Benchmark)
@Fork(value = 2)
@Warmup(iterations = 2, time = 3)
@Measurement(iterations = 2, time = 3)
public class A {
    
    public List<Object> list;
    
    @Setup
    public void setup() {
        list = new ArrayList<>();
        for (int i = 0; i < 1000; i++) list.add(i);
    }
    
    @Benchmark
    public void collectionForEach(Blackhole hole) {
        list.forEach(hole::consume);
    }
    
    @Benchmark
    public void iteratorFor(Blackhole hole) {
        for (Iterator<Object> iterator = list.iterator(); iterator.hasNext(); ) {
            hole.consume(iterator.next());
        }
    }
    
    @Benchmark
    public void enhancedFor(Blackhole hole) {
        for (Object e : list) {
            hole.consume(e);
        }
    }
    
    @Benchmark
    public void iteratorForEach(Blackhole hole) {
        list.iterator().forEachRemaining(hole::consume);
    }
    
    @Benchmark
    public void indexedFor(Blackhole hole) {
        for (int i = 0, size = list.size(); i < size; i++) {
            hole.consume(list.get(i));
        }
    }
    
    @Benchmark
    public void streamForEach(Blackhole hole) {
        list.stream().forEach(hole::consume);
    }
    
    @Benchmark
    public void spliteratorForEach(Blackhole hole) {
        list.spliterator().forEachRemaining(hole::consume);
    }
}

And the results. "ops/s" means operations/second, higher is better.

Benchmark              Mode  Cnt       Score      Error  Units
A.collectionForEach   thrpt    4  158047.064 ±  959.534  ops/s
A.iteratorFor         thrpt    4  177338.462 ± 3245.129  ops/s
A.enhancedFor         thrpt    4  177508.037 ± 1629.494  ops/s
A.iteratorForEach     thrpt    4  184919.605 ± 1922.114  ops/s
A.indexedFor          thrpt    4  193318.529 ± 2715.611  ops/s
A.streamForEach       thrpt    4  280963.272 ± 2253.621  ops/s
A.spliteratorForEach  thrpt    4  283264.539 ± 3055.967  ops/s
Youngman answered 31/12, 2020 at 21:59 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.