Why is the sum of reciprocals using a for-loop ~400x faster than streams?
This code benchmarks three different ways to compute the sum of the reciprocals of the elements of a double[]:

  1. a for-loop
  2. Java 8 streams
  3. the Colt math library

What is the reason that the computation using a simple for-loop is ~400 times faster than the one using streams? (Or is there anything that needs to be improved in the benchmarking code? Or a faster way of computing this using streams?)

Code:

import java.util.Arrays;
import java.util.concurrent.TimeUnit;
import cern.colt.list.DoubleArrayList;
import cern.jet.stat.Descriptive;
import org.openjdk.jmh.annotations.*;

@State(Scope.Thread)
public class MyBenchmark {

    public static double[] array;

    static {
        int num_of_elements = 100;
        array = new double[num_of_elements];
        for (int i = 0; i < num_of_elements; i++) {
            array[i] = i+1;
        }
    }

    @Benchmark
    @BenchmarkMode(Mode.AverageTime)
    @OutputTimeUnit(TimeUnit.NANOSECONDS)
    public void testInversionSumForLoop(){
        double result = 0;
        for (int i = 0; i < array.length; i++) {
            result += 1.0/array[i];
        }
    }

    @Benchmark
    @BenchmarkMode(Mode.AverageTime)
    @OutputTimeUnit(TimeUnit.NANOSECONDS)
    public void testInversionSumUsingStreams(){
        double result = 0;
        result = Arrays.stream(array).map(d -> 1/d).sum();
    }

    @Benchmark
    @BenchmarkMode(Mode.AverageTime)
    @OutputTimeUnit(TimeUnit.NANOSECONDS)
    public void testInversionSumUsingCernColt(){
        double result = Descriptive.sumOfInversions(new DoubleArrayList(array), 0, array.length-1);
    }
}

Results:

/**
 * Results
 * Benchmark                                  Mode  Cnt    Score    Error  Units
 * MyBenchmark.testInversionSumForLoop        avgt  200    1.647 ±  0.155  ns/op
 * MyBenchmark.testInversionSumUsingCernColt  avgt  200  603.254 ± 22.199  ns/op
 * MyBenchmark.testInversionSumUsingStreams   avgt  200  645.895 ± 20.833  ns/op
 */

Update: these results show that Blackhole.consume (or returning the result) is necessary to prevent the JVM from eliminating the computation; a sketch of the corrected loop benchmark follows the updated results.

/**
 * Updated results after adding Blackhole.consume
 * Benchmark                                  Mode  Cnt    Score    Error  Units
 * MyBenchmark.testInversionSumForLoop        avgt  200  525.498 ± 10.458  ns/op
 * MyBenchmark.testInversionSumUsingCernColt  avgt  200  517.930 ±  2.080  ns/op
 * MyBenchmark.testInversionSumUsingStreams   avgt  200  582.103 ±  3.261  ns/op
 */
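
For reference, a minimal sketch of the corrected loop benchmark (per the comment below, returning the value is an alternative to Blackhole.consume; it assumes the same array field as above):

@Benchmark
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
public double testInversionSumForLoop() {
    double result = 0;
    for (int i = 0; i < array.length; i++) {
        result += 1.0 / array[i];
    }
    // returning the result makes it observable, so the JIT cannot eliminate the loop
    return result;
}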

Oracle JDK version "1.8.0_181", Darwin Kernel Version 17.7.0

Sathrum asked 12/1, 2019 at 21:13. Comments (3):
JMH does not automatically make the benchmark correct. The result of the computation is never used here, so the JIT removes the computation altogether in the loop case. You need to "consume" the result, either by calling Blackhole.consume or by simply adding return result; at the end of the benchmark method. – Cruciate
1.647 ns is too little time to sum 100 numbers; that's over 6e+10 operations per second. I think testInversionSumForLoop is being optimized away by the JIT compiler. – Elyssa
@Cruciate you are right, I missed Blackhole.consume; the results are now similar. – Sathrum
In your example the JVM most likely optimizes the loop away completely because the result value is never read after the computation. You should use Blackhole to consume the result, like below:

import static java.util.concurrent.TimeUnit.MILLISECONDS;

import java.util.Arrays;
import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.infra.Blackhole;

@State(Scope.Thread)
@Warmup(iterations = 10, time = 200, timeUnit = MILLISECONDS)
@Measurement(iterations = 20, time = 500, timeUnit = MILLISECONDS)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
public class MyBenchmark {

  static double[] array;

  static {
    int num_of_elements = 100;
    array = new double[num_of_elements];
    for (int i = 0; i < num_of_elements; i++) {
      array[i] = i + 1;
    }
  }

  double result = 0;

  @Benchmark
  public void baseline(Blackhole blackhole) {
    result = 1;
    result = result / 1.0;
    blackhole.consume(result);
  }

  @Benchmark
  public void testInversionSumForLoop(Blackhole blackhole) {
    for (int i = 0; i < array.length; i++) {
      result += 1.0 / array[i];
    }
    blackhole.consume(result);
  }

  @Benchmark
  public void testInversionSumUsingStreams(Blackhole blackhole) {
    result = Arrays.stream(array).map(d -> 1 / d).sum();
    blackhole.consume(result);
  }

}

This new benchmark shows a difference of about 4x, which is expected. Loops benefit from a number of JVM optimizations and don't involve object creation the way streams do.

Benchmark                                 Mode  Cnt    Score   Error  Units
MyBenchmark.baseline                      avgt  100    2.437 ±  0.139  ns/op
MyBenchmark.testInversionSumForLoop       avgt  100  135.512 ± 13.080  ns/op
MyBenchmark.testInversionSumUsingStreams  avgt  100  506.479 ±  4.209  ns/op

I added a baseline to show the cost of a single operation on my machine. The baseline ns/op is similar to your loop's ns/op, which IMO confirms your loop was optimized away.

I'd love for someone to tell me what a good baseline for this benchmark scenario would be.

My environment:

openjdk version "11.0.1" 2018-10-16
OpenJDK Runtime Environment 18.9 (build 11.0.1+13)
OpenJDK 64-Bit Server VM 18.9 (build 11.0.1+13, mixed mode)

Intel(R) Core(TM) i7-7700HQ CPU @ 2.80GHz
Linux 4.15.0-43-generic #46-Ubuntu SMP Thu Dec 6 14:45:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Odalisque answered 12/1, 2019 at 21:56. Comments (7):
My new results are very close (see the update in the question); I see you have a 4x difference here, probably because of the JVM versions. But as others said in the comments, you are also right about the root cause of this problem. – Sathrum
@Sathrum I reran with a different baseline and I still get 4x. Added my Java version and CPU. – Odalisque
It is probably an improvement after JDK 8; if that is true, it's good to know. I will try with JDK 11 as well to verify this. – Sathrum
@KarolDowbecki it does not have to be Blackhole; you could simply return the result, which looks a bit cleaner to me. Also, when you ask what the baseline result for a single operation is, do you mean the time taken for that single division? – Sec
This question is a follow-up of an earlier question and, as explained in the comments there, DoubleStream.sum() uses the Kahan summation algorithm, which provides higher precision but also takes a bit more time than a plain for loop. So it's not just loop vs. stream. You may compare .sum() with reduce(0, Double::sum)... (see the sketch after these comments) – Follow
@Follow If I understand correctly, the Kahan summation algorithm (KSA) was part of DoubleStream.sum() in JDK 8 as well, but on JDK 8 the scores are close (see the update section in my question); DoubleStream.sum() takes a little more time, which is probably due to the KSA. Whereas on JDK 11, the numbers for the simple loop are 4x better than on JDK 8. I should probably write a new question for this. – Sathrum
@KarolDowbecki you can be sure that the division has been elided in your baseline operation. For your original version, result += 1.0 / 1.0;, the term 1.0 / 1.0 was a compile-time constant. For your changed variant, result = 1; result = result / 1.0;, the runtime optimizer has to kick in, but then it does not get confused by the two subsequent (redundant) assignments. So you changed the increment operation to a plain assignment of 1, which is why it became even faster. A true division would be significantly slower. You should try something like result = (result + 1) / 1.0; – Follow
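
Putting the last two comments into code: a sketch of a reduce-based stream benchmark (to compare against the Kahan-compensated sum()) and of Follow's suggested baseline. The method names are illustrative, and the result field is the one declared in the answer's benchmark class:

@Benchmark
public double baselineDivision() {
    // the previous result feeds into the next iteration, so the JIT
    // cannot constant-fold the division away
    result = (result + 1) / 1.0;
    return result;
}

@Benchmark
public double testInversionSumUsingStreamsReduce() {
    // plain left-to-right reduction; unlike DoubleStream.sum() it skips
    // the Kahan error compensation, trading precision for a bit of speed
    return Arrays.stream(array).map(d -> 1 / d).reduce(0, Double::sum);
}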
