I'm trying to understand why it is wise to use Blackhole.consumeCPU()
?
Something I found about Blackhole.consumeCPU()
on Google
Sometimes when we run run a benchmark across multiple threads we also want to burn some cpu cycles to simulate CPU business when running our code. This can't be a Thread.sleep as we really want to burn cpu. The Blackhole.consumeCPU(long) gives us the capability to do this.
My example code:
import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Level;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Warmup;
import org.openjdk.jmh.infra.Blackhole;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;
@State(Scope.Thread)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
public class StringConcatAvgBenchmark {
StringBuilder stringBuilder1;
StringBuilder stringBuilder2;
StringBuffer stringBuffer1;
StringBuffer stringBuffer2;
String string1;
String string2;
/*
* re-initializing the value after every iteration
*/
@Setup(Level.Iteration)
public void init() {
stringBuilder1 = new StringBuilder("foo");
stringBuilder2 = new StringBuilder("bar");
stringBuffer1 = new StringBuffer("foo");
stringBuffer2 = new StringBuffer("bar");
string1 = new String("foo");
string2 = new String("bar");
}
@Benchmark
@Warmup(iterations = 10)
@Measurement(iterations = 100)
@BenchmarkMode(Mode.AverageTime)
public StringBuilder stringBuilder() {
// operation is very thin and so consuming some CPU
Blackhole.consumeCPU(100);
return stringBuilder1.append(stringBuilder2);
// to avoid dead code optimization returning the value
}
@Benchmark
@Warmup(iterations = 10)
@Measurement(iterations = 100)
@BenchmarkMode(Mode.AverageTime)
public StringBuffer stringBuffer() {
Blackhole.consumeCPU(100);
// to avoid dead code optimization returning the value
return stringBuffer1.append(stringBuffer2);
}
@Benchmark
@Warmup(iterations = 10)
@Measurement(iterations = 100)
@BenchmarkMode(Mode.AverageTime)
public String stringPlus() {
Blackhole.consumeCPU(100);
return string1 + string2;
}
@Benchmark
@Warmup(iterations = 10)
@Measurement(iterations = 100)
@BenchmarkMode(Mode.AverageTime)
public String stringConcat() {
Blackhole.consumeCPU(100);
// to avoid dead code optimization returning the value
return string1.concat(string2);
}
public static void main(String[] args) throws RunnerException {
Options options = new OptionsBuilder()
.include(StringConcatAvgBenchmark.class.getSimpleName())
.threads(1).forks(1).shouldFailOnError(true).shouldDoGC(true)
.jvmArgs("-server").build();
new Runner(options).run();
}
}
Why are the results of this Benchmark better with the blackhole.consumeCPU(100)
?
UPDATE:
Output with blackhole.consumeCPU(100)
:
Benchmark Mode Cnt Score Error Units
StringBenchmark.stringBuffer avgt 10 398,843 ± 38,666 ns/op
StringBenchmark.stringBuilder avgt 10 387,543 ± 40,087 ns/op
StringBenchmark.stringConcat avgt 10 410,256 ± 33,194 ns/op
StringBenchmark.stringPlus avgt 10 386,472 ± 21,704 ns/op
Output without blackhole.consumeCPU(100)
:
Benchmark Mode Cnt Score Error Units
StringBenchmark.stringBuffer avgt 10 51,225 ± 19,254 ns/op
StringBenchmark.stringBuilder avgt 10 49,548 ± 4,126 ns/op
StringBenchmark.stringConcat avgt 10 50,373 ± 1,408 ns/op
StringBenchmark.stringPlus avgt 10 87,942 ± 1,701 ns/op
I think I know now why they used this, because the benchmarks are too quick without some delay.
With blackhole.consumeCPU(100)
you can measure each benchmark better and receive more significant results.
Is that right ?