AtomicBoolean vs synchronized block

I was trying to cut thread contention in my code by replacing some synchronized blocks with AtomicBoolean.

Here's an example with synchronized:

public void toggleCondition() {
    synchronized (this.mutex) {
        if (this.toggled) {
            return;
        }

        this.toggled = true;
        // do other stuff
    }
}

And the alternative with AtomicBoolean:

public void toggleCondition() {
    if (!this.condition.getAndSet(true)) {
        // do other stuff
    }
}
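For reference, a self-contained version of both variants might look like the sketch below (the field declarations and boolean return values are assumptions added for illustration; the original snippets omit them):

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class Toggle {
    private final Object mutex = new Object();
    private boolean toggled = false;                         // guarded by mutex
    private final AtomicBoolean condition = new AtomicBoolean(false);

    // synchronized variant: only the first caller gets past the guard
    public boolean toggleSynchronized() {
        synchronized (mutex) {
            if (toggled) {
                return false;
            }
            toggled = true;
            return true; // "do other stuff" would happen here
        }
    }

    // CAS variant: getAndSet atomically stores true and returns the old value,
    // so exactly one caller ever sees false
    public boolean toggleAtomic() {
        return !condition.getAndSet(true);
    }
}
```

Both methods guarantee the "other stuff" runs at most once; they differ only in how the winner is decided.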

Taking advantage of AtomicBoolean's CAS property should be way faster than relying on synchronization, so I ran a little micro-benchmark.

For 10 concurrent threads and 1000000 iterations, AtomicBoolean comes in only slightly faster than synchronized block.

Average time (per thread) spent on toggleCondition() with AtomicBoolean: 0.0338

Average time (per thread) spent on toggleCondition() with synchronized: 0.0357

I know micro-benchmarks are worth what they're worth but shouldn't the difference be higher?

Tristich answered 3/10, 2010 at 0:33 Comment(3)
You may have had very little contention in the original code to begin with. What led you to thinking this needed to be optimized?Ruthie
Running performance tests on a lib I wrote. I had a default implementation using synchronized blocks and manual wait()/notify() synchronization. I then tried another version of it using only java.util.concurrent stuff and I got worst results. Here are the classes: github.com/brunodecarvalho/hotpotato/blob/master/src/main/java/… and github.com/brunodecarvalho/hotpotato/blob/master/src/main/java/…Tristich
I've come to realize after another micro-benchmark (yes, I do know micro-benchmarks are somewhat flawed...) that the problem actually lay in the usage of CountDownLatch over wait()/notify(). Besides, that AtomicBoolean approach wasn't completely safe, so I dropped it.Tristich

I know micro-benchmarks are worth what they're worth but shouldn't the difference be higher?

I think the problem is in your benchmark. It looks like each thread is going to toggle the condition just once. The benchmark will spend most of its time creating and destroying threads. The chance that any given thread will be toggling a condition at the same time as any other thread is toggling it will be close to zero.

An AtomicBoolean has a performance advantage over primitive locking when there is significant contention for the condition. For an uncontended condition, I'd expect to see little difference.

Change your benchmark so that each thread toggles the condition a few million times. That will guarantee lots of lock contention, and I expect you will see a performance difference.
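A minimal sketch of such a contended benchmark, assuming a shared flag and placeholder thread/iteration counts (this is not the OP's harness, just an illustration of the suggestion):

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicBoolean;

public class ContendedBench {
    static final AtomicBoolean flag = new AtomicBoolean(false);

    // Runs `threads` workers that each hit the same flag `iterations` times,
    // and returns the elapsed wall time in nanoseconds.
    public static long run(int threads, int iterations) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch done = new CountDownLatch(threads);
        long start = System.nanoTime();
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                for (int i = 0; i < iterations; i++) {
                    // every thread hammers the same variable, so the CAS
                    // inside getAndSet actually has something to contend with
                    flag.getAndSet(true);
                    flag.set(false); // reset so later iterations still race
                }
                done.countDown();
            });
        }
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        pool.shutdown();
        return System.nanoTime() - start;
    }
}
```

The absolute numbers will still be noisy, but at least every getAndSet call now races against other threads touching the same memory location.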

EDIT

If the scenario you intended to test only involved one toggle per thread (and 10 threads), then it is unlikely that your application would experience contention, and therefore it is unlikely that using AtomicBoolean will make any difference.

At this point, I should ask why you are focusing your attention on this particular aspect. Have you profiled your application and determined that you really have a lock contention problem? Or are you just guessing? Have you been given the standard lecture on the evils of premature optimization yet?

Fp answered 3/10, 2010 at 1:20 Comment(7)
I'm only clocking the time it takes to call toggleCondition().Tristich
I know that. The problem is that the way that the benchmark is written means that there is almost no contention. Change the code so that there is significant contention and I expect you will see a performance difference.Fp
Check my replies on the comment from matt b on the question. In a real use case I was reading worse values for the version using AtomicBoolean; but it seems it was actually CountDownLatch causing the slowdown. Still, how do you suggest I increase contention there?Tristich
Like I said ... change each thread's run method to call toggleCondition() a large number of times.Fp
Kind of goes against the scenario I intended to test (only one toggle could occur) but I'll give it a try...Tristich
Regarding your EDIT, yes I have. I don't have a performance issue, I was just being curious; last I heard it wasn't a bad thing.Tristich
@brunodecarvalho - curiosity is not a bad thing. However, you should focus your curiosity on something that is likely to matter. Otherwise you are just wasting your time ... like hunting for unicorns.Fp

Looking at the actual implementation (reading the code is far more informative than a micro-benchmark, which is of dubious value in Java or any other garbage-collected runtime), I am not surprised it isn't "significantly faster". It is basically doing an implicit synchronized section.

/**
 * Atomically sets to the given value and returns the previous value.
 *
 * @param newValue the new value
 * @return the previous value
 */
public final boolean getAndSet(boolean newValue) {
    for (;;) {
        boolean current = get();
        if (compareAndSet(current, newValue))
            return current;
    }
}

/**
 * Atomically sets the value to the given updated value
 * if the current value {@code ==} the expected value.
 *
 * @param expect the expected value
 * @param update the new value
 * @return true if successful. False return indicates that
 * the actual value was not equal to the expected value.
 */
public final boolean compareAndSet(boolean expect, boolean update) {
    int e = expect ? 1 : 0;
    int u = update ? 1 : 0;
    return unsafe.compareAndSwapInt(this, valueOffset, e, u);
}

And then this from sun.misc.Unsafe:

/**
 * Atomically update Java variable to <tt>x</tt> if it is currently
 * holding <tt>expected</tt>.
 * @return <tt>true</tt> if successful
 */
public final native boolean compareAndSwapInt(Object o, long offset,
                                              int expected,
                                              int x);
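For what it's worth, the CAS in that loop either succeeds or reports failure without ever blocking. This tiny sketch (not from the JDK, just an illustration) shows both outcomes:

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class CasDemo {
    // Returns {firstCasResult, secondCasResult, finalValue}.
    public static boolean[] demo() {
        AtomicBoolean b = new AtomicBoolean(false);
        boolean first = b.compareAndSet(false, true);  // expected value matches: succeeds
        boolean second = b.compareAndSet(false, true); // expected value is stale: fails,
                                                       // and the stored value is untouched
        return new boolean[] { first, second, b.get() };
    }
}
```

The retry loop in getAndSet exists precisely for that second case: when another thread wins the race, the loser re-reads the current value and tries again.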

There is no magic in this; resource contention is a bitch and very complex. That is why using final variables and working with immutable data is so prevalent in real concurrent languages like Erlang. All this complexity that eats CPU time is bypassed, or at least shifted somewhere less complex.

Why answered 3/10, 2010 at 1:24 Comment(0)