I've had this question for quite a while now. I've read lots of resources trying to understand what is going on, but I still don't have a good understanding of why things are the way they are.
Simply put, I'm trying to test how a CAS would perform vs synchronized in contended and uncontended environments. I've put together this JMH test:
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Warmup(iterations = 5, time = 5, timeUnit = TimeUnit.SECONDS)
@Measurement(iterations = 5, time = 5, timeUnit = TimeUnit.SECONDS)
@State(Scope.Benchmark)
public class SandBox {

    Object lock = new Object();

    public static void main(String[] args) throws RunnerException {
        Options opt = new OptionsBuilder().include(SandBox.class.getSimpleName())
                                          .jvmArgs("-ea", "-Xms10g", "-Xmx10g")
                                          .shouldFailOnError(true)
                                          .build();
        new Runner(opt).run();
    }

    @State(Scope.Thread)
    public static class Holder {

        private long number;

        private AtomicLong atomicLong;

        @Setup
        public void setUp() {
            number = ThreadLocalRandom.current().nextLong();
            atomicLong = new AtomicLong(number);
        }
    }

    @Fork(1)
    @Benchmark
    public long sync(Holder holder) {
        long n = holder.number;
        synchronized (lock) {
            n = n * 123;
        }

        return n;
    }

    @Fork(1)
    @Benchmark
    public AtomicLong cas(Holder holder) {
        AtomicLong al = holder.atomicLong;
        al.updateAndGet(x -> x * 123);
        return al;
    }

    private Object anotherLock = new Object();

    private long anotherNumber = ThreadLocalRandom.current().nextLong();

    private AtomicLong anotherAl = new AtomicLong(anotherNumber);

    @Fork(1)
    @Benchmark
    public long syncShared() {
        synchronized (anotherLock) {
            anotherNumber = anotherNumber * 123;
        }

        return anotherNumber;
    }

    @Fork(1)
    @Benchmark
    public AtomicLong casShared() {
        anotherAl.updateAndGet(x -> x * 123);
        return anotherAl;
    }

    @Fork(value = 1, jvmArgsAppend = "-XX:-UseBiasedLocking")
    @Benchmark
    public long syncSharedNonBiased() {
        synchronized (anotherLock) {
            anotherNumber = anotherNumber * 123;
        }

        return anotherNumber;
    }
}
And the results:
Benchmark                                           Mode  Cnt     Score      Error  Units
spinLockVsSynchronized.SandBox.cas                  avgt    5   212.922 ±   18.011  ns/op
spinLockVsSynchronized.SandBox.casShared            avgt    5  4106.764 ± 1233.108  ns/op
spinLockVsSynchronized.SandBox.sync                 avgt    5  2869.664 ±  231.482  ns/op
spinLockVsSynchronized.SandBox.syncShared           avgt    5  2414.177 ±   85.022  ns/op
spinLockVsSynchronized.SandBox.syncSharedNonBiased  avgt    5  2696.102 ±  279.734  ns/op
In the non-shared case CAS is by far faster, which I would expect. But in the shared case, things are the other way around, and this I can't understand. I don't think this is related to biased locking, as that would happen after a thread holds the lock for 5 seconds (AFAIK); this does not happen, and the test is just proof of that.
I honestly hope it's just my tests that are wrong, and that someone with JMH expertise will come along and point me to the wrong set-up here.
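To be explicit about what I mean by the shared case, here is a minimal plain-Java sketch of it (no JMH and no timing, so it says nothing about performance). It only checks that both update styles do the same work under contention. The class name `SharedUpdateSketch`, the seed, the thread count and the step count are made up for illustration; they are not the values the benchmark above ran with.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicLong;

public class SharedUpdateSketch {

    private final Object lock = new Object();
    private long guarded;              // only touched while holding 'lock'
    private final AtomicLong atomic;   // updated with a CAS retry loop

    SharedUpdateSketch(long seed) {
        this.guarded = seed;
        this.atomic = new AtomicLong(seed);
    }

    // Same update as syncShared(): read-modify-write under a shared monitor.
    void syncStep() {
        synchronized (lock) {
            guarded = guarded * 123;
        }
    }

    // Same update as casShared(): updateAndGet retries until its CAS succeeds.
    void casStep() {
        atomic.updateAndGet(x -> x * 123);
    }

    long guardedValue() {
        synchronized (lock) {
            return guarded;
        }
    }

    long atomicValue() {
        return atomic.get();
    }

    // Hypothetical driver: all threads hammer the same instance at once.
    static SharedUpdateSketch runContended(long seed, int threads, int stepsPerThread) {
        SharedUpdateSketch s = new SharedUpdateSketch(seed);
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    start.await();   // release all workers together
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                for (int j = 0; j < stepsPerThread; j++) {
                    s.syncStep();
                    s.casStep();
                }
            });
            workers[i].start();
        }
        start.countDown();
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return s;
    }

    public static void main(String[] args) {
        SharedUpdateSketch s = runContended(7L, 4, 10_000);
        // Multiplication wraps modulo 2^64 and is order-independent,
        // so both counters must end at the same value.
        System.out.println(s.guardedValue() == s.atomicValue()); // prints "true"
    }
}
```

Under contention the CAS version may execute its lambda more than once per call, because a failed compare-and-set is retried, while the synchronized version parks or spins waiting for the monitor. That difference is what I'm trying to measure.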
[…] CAS. they assume less contention? Or a finer-grained lock that could be taken for far less time? – Antigone

[…] @Threads(50) in at least one test, but this information is missing in the question. It’s not clear what number of threads were used for the results posted in the question. – Antagonistic

[…] sync and syncShared as in both cases, you are synchronizing on a shared object. It only differs in the locality of the variable you’re updating, but, of course, it makes no sense to acquire a global lock to update an unshared variable. – Antagonistic

[…] sync is taking a Holder as input; that in turn is annotated with @State(Scope.Thread) - each thread that runs the test will have its own copy of that Holder. I think this explains it better: hg.openjdk.java.net/code-tools/jmh/file/b46a93657b82/… – Antigone

[…] lock, which is an instance variable of SandBox, containing a shared object, rather than the local Holder object. – Antagonistic

[…] sync test eludes me. Unlike cas[Shared], it reads a shared value, but it does not publish a new one, and the read of the initial value isn't even guarded by the lock. Does holder.number even get updated between invocations during the same trial? – Bedelia
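The last comment's point can be seen without JMH. The following is only a sketch: `PublishSketch` and `syncPublish` are hypothetical names, and `Holder` here is a stripped-down stand-in for the question's state class with a fixed seed instead of a random one. `syncNoPublish` mirrors the body of `sync` from the question; `syncPublish` is one possible variant that reads and writes the field while holding the lock.

```java
import java.util.concurrent.atomic.AtomicLong;

public class PublishSketch {

    // Stand-in for the question's Holder, seeded explicitly for the demo.
    static final class Holder {
        long number;
        final AtomicLong atomicLong;

        Holder(long seed) {
            number = seed;
            atomicLong = new AtomicLong(seed);
        }
    }

    private final Object lock = new Object();

    // Mirrors sync(): the field is read outside the lock and the result
    // only ever lives in the local 'n', so holder.number never changes.
    long syncNoPublish(Holder h) {
        long n = h.number;
        synchronized (lock) {
            n = n * 123;
        }
        return n;
    }

    // Hypothetical variant: read, multiply and write back under the lock.
    long syncPublish(Holder h) {
        synchronized (lock) {
            h.number = h.number * 123;
            return h.number;
        }
    }

    // Mirrors cas(): the new value is stored in the AtomicLong on every call.
    long cas(Holder h) {
        return h.atomicLong.updateAndGet(x -> x * 123);
    }

    public static void main(String[] args) {
        PublishSketch s = new PublishSketch();
        Holder h = new Holder(2L);

        s.syncNoPublish(h);
        s.syncNoPublish(h);
        System.out.println(h.number);            // prints 2

        s.syncPublish(h);
        s.syncPublish(h);
        System.out.println(h.number);            // prints 30258

        s.cas(h);
        s.cas(h);
        System.out.println(h.atomicLong.get());  // prints 30258
    }
}
```

So in the question's `sync` the same input is multiplied on every invocation, while `cas` keeps compounding its value; the two benchmarks are not doing the same amount of memory traffic.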