Java Concurrency: CAS vs Locking [closed]
Asked Answered
D

5

80

I'm reading the Book Java Concurrency in Practice. In chapter 15, they are talking about the nonblocking algorithms and the compare-and-swap (CAS) method.

It is written that CAS perform much better than the locking methods. I want to ask the people who already worked with both of these concepts and would like to hear when you are preferring which one of these concepts? Is it really so much faster?

For me, the usage of locks is much clearer and easier to understand and maybe even better to maintain (please correct me if I am wrong). Should we really focus on creating our concurrent code related to CAS than locks to get a better performance boost or is sustainability more important?

I know there is maybe not a strict rule when to use what. But I just would like to hear some opinions, experiences with the new concept of CAS.

Dabney answered 18/4, 2010 at 22:7 Comment(1)
Usual locking with monitor-objects involves CAS as well as the locking kernel-calls. The cass alleviates the locking so that there will be no kernel call without contention. And when there is contention the CAS might spin for a time and then the kernel call would take place or the kernel call would follow immediately.Interregnum
U
55

CAS is generally much faster than locking, but it does depend on the degree of contention. Because CAS may force a retry if the value changes between reading and comparing, a thread can theoretically get stuck in a busy-wait if the variable in question is being hit hard by many other threads (or if it is expensive to compute a new value from the old value (or both)).

The main issue with CAS is that it is much more difficult to program with correctly than locking. Mind you, locking is, in turn, much harder to use correctly than message-passing or STM, so don't take this as a ringing endorsement for the use of locks.

Ushaushant answered 18/4, 2010 at 22:22 Comment(1)
I agree, putting accent on expense of computing a new value. Often it is so big, making non-locking strategy lose to lock-based.Footsore
F
30

The relative speed of the operations is largely a non-issue. What is relevant is the difference in scalability between lock-based and nonblocking algorithms. And if you're running on a 1 or 2 core system, stop thinking about such things.

Nonblocking algorithms generally scale better because they have shorter "critical sections" than lock-based algorithms.

Festatus answered 29/4, 2010 at 0:1 Comment(0)
A
24

You can look at the numbers between a ConcurrentLinkedQueue and a BlockingQueue. What you will see is that CAS is noticeably faster under moderate (more realistic in real world applications) thread contention.

The most attractive property of nonblocking algorithms is the fact that if one thread fails (cache miss, or worse, seg fault) then other threads will not notice this failure and can move on. However, when acquiring a lock, if the lock holding thread has some kind of OS failure, every other thread waiting for the lock to be freed will be hit with the failure also.

To answer your questions, yes, nonblocking thread-safe algorithms or collections (ConcurrentLinkedQueue, ConcurrentSkipListMap/Set) can be significantly faster than their blocking counterparts. As Marcelo pointed out though, getting nonblocking algorithms correct is very difficult and requires a lot of consideration.

You should read about the Michael and Scott Queue, this is the queue implementation for ConcurrentLinkedQueue and explains how to handle a two-way, thread-safe, atomic function with a single CAS.

Advance answered 18/4, 2010 at 23:0 Comment(1)
The article about the "Michael and Scott Queue" is interesting. Thx!Dabney
D
14

There's good book strongly related to the topic of lock-free concurrency: "The Art of multi-processor programming" by Maurice Herlihy

Deliquesce answered 22/4, 2010 at 10:15 Comment(1)
Also his presentation part 1 and part 2 are very worthwhile.Tropous
M
11

If you are looking for a real world comparison, here is one. Our application has two (2) threads 1) A reader thread for network packet capture and 2) a consumer thread that takes the packet, counts it and reports statistics.

Thread #1 exchanges a single packet at a time with thread #2

Result #1 - uses a custom CAS based exchange using the same principles as SynchronousQueue, where our class is called CASSynchronousQueue:

30,766,538 packets in 59.999 seconds ::  500.763Kpps, 1.115Gbps 0 drops
libpcap statistics: recv=61,251,128, drop=0(0.0%), ifdrop=0

Result #2 - when we replace our CAS implementation with the standard java SynchronousQueue:

8,782,647 packets in 59.999 seconds ::  142.950Kpps, 324.957Mbps 0 drops 
libpcap statistics: recv=69,955,666, drop=52,369,516(74.9%), ifdrop=0

I don't think the difference in performance couldn't be any more clearer.

Mir answered 20/7, 2015 at 3:12 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.