Verify JMH measurements of simple for/lambda comparisons

I wanted to do some performance measurements and comparisons of simple for loops and equivalent streams implementations. I believe it's the case that streams will be somewhat slower than equivalent non-streams code, but I wanted to be sure I'm measuring the right things.

I'm including my entire jmh class here.

import java.util.ArrayList;
import java.util.List;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;

@State(Scope.Benchmark)
public class MyBenchmark {
    List<String>    shortLengthListConstantSize     = null;
    List<String>    mediumLengthListConstantSize    = null;
    List<String>    longerLengthListConstantSize    = null;
    List<String>    longLengthListConstantSize      = null;

    @Setup
    public void setup() {
        shortLengthListConstantSize     = populateList(2);
        mediumLengthListConstantSize    = populateList(12);
        longerLengthListConstantSize    = populateList(300);
        longLengthListConstantSize      = populateList(300000);
    }

    private List<String> populateList(int size) {
        List<String> list   = new ArrayList<>();
        for (int ctr = 0; ctr < size; ++ ctr) {
            list.add("xxx");
        }
        return list;
    }

    @Benchmark
    public long shortLengthConstantSizeFor() {
        long count   = 0;
        for (String val : shortLengthListConstantSize) {
            if (val.length() == 3) { ++ count; }
        }
        return count;
    }

    @Benchmark
    public long shortLengthConstantSizeForEach() {
        IntHolder   intHolder   = new IntHolder();
        shortLengthListConstantSize.forEach(s -> { if (s.length() == 3) ++ intHolder.value; } );
        return intHolder.value;
    }

    @Benchmark
    public long shortLengthConstantSizeLambda() {
        return shortLengthListConstantSize.stream().filter(s -> s.length() == 3).count();
    }

    @Benchmark
    public long shortLengthConstantSizeLambdaParallel() {
        return shortLengthListConstantSize.stream().parallel().filter(s -> s.length() == 3).count();
    }

    @Benchmark
    public long mediumLengthConstantSizeFor() {
        long count   = 0;
        for (String val : mediumLengthListConstantSize) {
            if (val.length() == 3) { ++ count; }
        }
        return count;
    }

    @Benchmark
    public long mediumLengthConstantSizeForEach() {
        IntHolder   intHolder   = new IntHolder();
        mediumLengthListConstantSize.forEach(s -> { if (s.length() == 3) ++ intHolder.value; } );
        return intHolder.value;
    }

    @Benchmark
    public long mediumLengthConstantSizeLambda() {
        return mediumLengthListConstantSize.stream().filter(s -> s.length() == 3).count();
    }

    @Benchmark
    public long mediumLengthConstantSizeLambdaParallel() {
        return mediumLengthListConstantSize.stream().parallel().filter(s -> s.length() == 3).count();
    }

    @Benchmark
    public long longerLengthConstantSizeFor() {
        long count   = 0;
        for (String val : longerLengthListConstantSize) {
            if (val.length() == 3) { ++ count; }
        }
        return count;
    }

    @Benchmark
    public long longerLengthConstantSizeForEach() {
        IntHolder   intHolder   = new IntHolder();
        longerLengthListConstantSize.forEach(s -> { if (s.length() == 3) ++ intHolder.value; } );
        return intHolder.value;
    }

    @Benchmark
    public long longerLengthConstantSizeLambda() {
        return longerLengthListConstantSize.stream().filter(s -> s.length() == 3).count();
    }

    @Benchmark
    public long longerLengthConstantSizeLambdaParallel() {
        return longerLengthListConstantSize.stream().parallel().filter(s -> s.length() == 3).count();
    }

    @Benchmark
    public long longLengthConstantSizeFor() {
        long count   = 0;
        for (String val : longLengthListConstantSize) {
            if (val.length() == 3) { ++ count; }
        }
        return count;
    }

    @Benchmark
    public long longLengthConstantSizeForEach() {
        IntHolder   intHolder   = new IntHolder();
        longLengthListConstantSize.forEach(s -> { if (s.length() == 3) ++ intHolder.value; } );
        return intHolder.value;
    }

    @Benchmark
    public long longLengthConstantSizeLambda() {
        return longLengthListConstantSize.stream().filter(s -> s.length() == 3).count();
    }

    @Benchmark
    public long longLengthConstantSizeLambdaParallel() {
        return longLengthListConstantSize.stream().parallel().filter(s -> s.length() == 3).count();
    }

    public static class IntHolder {
        public int value    = 0;
    }
}

I'm running these on a Win7 laptop. I don't care about absolute measurements, just relative. Here are the latest results from these:

Benchmark                                            Mode  Cnt          Score         Error  Units
MyBenchmark.longLengthConstantSizeFor               thrpt  200       2984.554 ±      57.557  ops/s
MyBenchmark.longLengthConstantSizeForEach           thrpt  200       2971.701 ±     110.414  ops/s
MyBenchmark.longLengthConstantSizeLambda            thrpt  200        331.741 ±       2.196  ops/s
MyBenchmark.longLengthConstantSizeLambdaParallel    thrpt  200       2827.695 ±     682.662  ops/s
MyBenchmark.longerLengthConstantSizeFor             thrpt  200    3551842.518 ±   42612.744  ops/s
MyBenchmark.longerLengthConstantSizeForEach         thrpt  200    3616285.629 ±   16335.379  ops/s
MyBenchmark.longerLengthConstantSizeLambda          thrpt  200    2791292.093 ±   12207.302  ops/s
MyBenchmark.longerLengthConstantSizeLambdaParallel  thrpt  200      50278.869 ±    1977.648  ops/s
MyBenchmark.mediumLengthConstantSizeFor             thrpt  200   55447999.297 ±  277442.812  ops/s
MyBenchmark.mediumLengthConstantSizeForEach         thrpt  200   57381287.954 ±  362751.975  ops/s
MyBenchmark.mediumLengthConstantSizeLambda          thrpt  200   15925281.039 ±   65707.093  ops/s
MyBenchmark.mediumLengthConstantSizeLambdaParallel  thrpt  200      60082.495 ±     581.405  ops/s
MyBenchmark.shortLengthConstantSizeFor              thrpt  200  132278188.475 ± 1132184.820  ops/s
MyBenchmark.shortLengthConstantSizeForEach          thrpt  200  124158664.044 ± 1112991.883  ops/s
MyBenchmark.shortLengthConstantSizeLambda           thrpt  200   18750818.019 ±  171239.562  ops/s
MyBenchmark.shortLengthConstantSizeLambdaParallel   thrpt  200     474054.951 ±    1344.705  ops/s

In an earlier question, I confirmed that these benchmarks appear to be "functionally equivalent" (just looking for additional eyes). Do these numbers appear to be in line, perhaps with independent runs of these benchmarks?

Another thing that I've always been uncertain about with JMH output is determining exactly what the throughput numbers represent. For instance, what does the "200" in the "Cnt" column exactly represent? The throughput units are in "operations per second", so what exactly does the "operation" represent, is that the execution of one call to the benchmark method? If so, the last row would represent 474k executions of the benchmark method per second.

Update:

I note that when I compare the "for" with the "lambda", going from the "short" list to longer lists, the lambda throughput as a percentage of the for throughput starts low and climbs, until the "long" list, where it drops below even the value for the "short" list (14%, 29%, 78%, and 11%). I find this surprising. I would have expected the relative overhead of streams to decrease as the work in the actual business logic increases. Anyone have any thoughts on that?
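
For reference, the percentages above can be recomputed from the results table with a few lines of plain Java. This is only a sketch: the class and method names are made up for illustration, and the scores are copied from the table as posted.

```java
class ThroughputRatios {
    /** Returns the stream throughput as a percentage of the plain-loop throughput. */
    static double percentOfLoop(double loopOpsPerSec, double streamOpsPerSec) {
        return 100.0 * streamOpsPerSec / loopOpsPerSec;
    }

    public static void main(String[] args) {
        // Scores (ops/s) copied from the "For" and "Lambda" rows of the results table.
        String[] labels = {"short", "medium", "longer", "long"};
        double[] forScores = {132278188.475, 55447999.297, 3551842.518, 2984.554};
        double[] lambdaScores = {18750818.019, 15925281.039, 2791292.093, 331.741};

        for (int i = 0; i < labels.length; i++) {
            System.out.printf("%-6s lambda is %.1f%% of for%n",
                    labels[i], percentOfLoop(forScores[i], lambdaScores[i]));
        }
    }
}
```

Run as-is, this gives roughly 14.2%, 28.7%, 78.6% and 11.1%, which matches the figures quoted above up to rounding.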

Polonium answered 26/3, 2019 at 3:58 Comment(2)
Hint: JMH supports setups that do not require repeating the code for different inputs. - Carine
Acknowledged. I'll incorporate that into the next version of this. - Polonium

For instance, what does the "200" in the "Cnt" column exactly represent?

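The Cnt column is the number of measurement iterations, i.e. how many times the measurement is repeated, summed over all forks. The 200 here is presumably 10 forks × 20 iterations, which were the defaults in older JMH versions. You can control the number of iterations using the following annotations (and the number of forks with @Fork):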

  • For the actual measurements: @Measurement(iterations = 10, time = 50, timeUnit = TimeUnit.MILLISECONDS)
  • For the warmup phase: @Warmup(iterations = 10, time = 1, timeUnit = TimeUnit.SECONDS)

Here iterations determines Cnt (per fork), time is the required duration of one iteration, and timeUnit is the unit of the time value.

The throughput units are in "operations per second"

You can control the output in several ways. For instance, you can change the time unit with @OutputTimeUnit(TimeUnit.XXXX), so you can get ops/us, ops/ms, and so on.

You can also change the mode: instead of measuring ops/time, you can measure "average time", "sample time", etc. You can control this via the @BenchmarkMode({Mode.AverageTime}) annotation.

so what exactly does the "operation" represent, is that the execution of one call to the benchmark method

So let's say one iteration is 1 second long and you get 1000 ops/sec. This means that the benchmark method has been executed 1000 times in that iteration.

In other words, one operation is one execution of the benchmark method, unless you have the @OperationsPerInvocation(XXX) annotation, which means that each invocation of the method will count as XXX operations.
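
Because one operation is one call of the benchmark method, and each call walks the whole list, the scores for different list sizes are not directly comparable. A rough way to put them on the same footing is to multiply ops/s by the number of elements handled per call. The sketch below does that in plain Java; the class and method names are hypothetical, and the sizes and scores are copied from the question.

```java
class PerElementThroughput {
    /** Elements handled per second: ops/s times the elements processed by one invocation. */
    static double elementsPerSecond(double opsPerSecond, int elementsPerInvocation) {
        return opsPerSecond * elementsPerInvocation;
    }

    /** Average nanoseconds spent in one invocation of the benchmark method. */
    static double nanosPerInvocation(double opsPerSecond) {
        return 1_000_000_000.0 / opsPerSecond;
    }

    public static void main(String[] args) {
        // List sizes from the question's setup() and scores (ops/s) from its results table.
        int[] sizes = {2, 12, 300, 300000};
        double[] forScores = {132278188.475, 55447999.297, 3551842.518, 2984.554};
        double[] lambdaScores = {18750818.019, 15925281.039, 2791292.093, 331.741};

        for (int i = 0; i < sizes.length; i++) {
            System.out.printf("size=%-6d for: %.3e elem/s (%.1f ns/call)  lambda: %.3e elem/s (%.1f ns/call)%n",
                    sizes[i],
                    elementsPerSecond(forScores[i], sizes[i]), nanosPerInvocation(forScores[i]),
                    elementsPerSecond(lambdaScores[i], sizes[i]), nanosPerInvocation(lambdaScores[i]));
        }
    }
}
```

With the question's numbers, for example, the plain loop over the 300000-element list works out to roughly 895 million elements per second.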

The error is calculated across all iterations.


One more tip: instead of hardcoding each possible size, you can do a parameterized benchmark:

@Param({"3", "12", "300", "3000"})
private int length;

Then you can use that param in your setup:

 @Setup(Level.Iteration)
 public void setUp() {
     // "list" is a hypothetical List<String> field that the benchmark methods read.
     list = populateList(length);
 }
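
Putting the pieces together, a parameterized version of the question's benchmark might look roughly like the sketch below. It assumes JMH is on the classpath and is run through the usual JMH runner. The class name, the list field, the method names and the iteration/fork counts are illustrative choices, not taken from the original code. It uses Level.Trial rather than Level.Iteration because the list is never modified, so building it once per fork should be enough.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Level;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Param;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Warmup;

@State(Scope.Benchmark)
@BenchmarkMode(Mode.Throughput)
@OutputTimeUnit(TimeUnit.SECONDS)
@Warmup(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
@Measurement(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
@Fork(2) // 2 forks x 5 measurement iterations, so Cnt should show 10
public class ParameterizedBenchmark {

    // JMH runs every @Benchmark method once per value; sizes match the question's lists.
    @Param({"2", "12", "300", "300000"})
    private int length;

    private List<String> list;

    @Setup(Level.Trial)
    public void setUp() {
        list = new ArrayList<>(length);
        for (int i = 0; i < length; i++) {
            list.add("xxx");
        }
    }

    @Benchmark
    public long forLoop() {
        long count = 0;
        for (String val : list) {
            if (val.length() == 3) { ++count; }
        }
        return count; // returning the result keeps the JIT from discarding the loop
    }

    @Benchmark
    public long stream() {
        return list.stream().filter(s -> s.length() == 3).count();
    }

    @Benchmark
    public long parallelStream() {
        return list.stream().parallel().filter(s -> s.length() == 3).count();
    }
}
```

This replaces the four copies of each method with one, and the results table gains a "(length)" column so the sizes can be compared row by row.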
Low answered 26/3, 2019 at 5:16 Comment(0)
