This code uses deep Java Memory Model voodoo, as it mixes both locks and volatiles.
The lock usage in this code is easy to dispense with, though. Locking provides memory ordering among threads that use the same lock. Specifically, the unlock at the end of this method provides happens-before semantics with other threads that acquire the same lock. Other code paths through this class, though, don't use this lock at all. Therefore, the memory model implications for the lock are irrelevant to those code paths.
Those other code paths do use volatile reads and writes, specifically to the array field. The getArray method does a volatile read of this field, and the setArray method does a volatile write of this field.
The reason this code calls setArray even when it's apparently unnecessary is so that it establishes an invariant for this method that it always performs a volatile write to this array. This establishes happens-before semantics with other threads that perform volatile reads from this array. This is important because the volatile write-read semantics apply to reads and writes other than those of the volatile field itself. Specifically, writes to other (non-volatile) fields before a volatile write happen-before reads from those other fields after a volatile read of the same volatile variable. See the JMM FAQ for an explanation.
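To make the invariant concrete, here is a minimal sketch of the pattern described above. It is not the JDK source: TinyCowList and its members are hypothetical names, and the class is a simplified one-element-capable illustration that assumes a ReentrantLock guards writers and a volatile field holds the array.

```java
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical, simplified copy-on-write holder; an illustration, not the JDK source. */
final class TinyCowList<E> {
    private final ReentrantLock lock = new ReentrantLock();
    private volatile Object[] array;

    TinyCowList(E first) {
        array = new Object[] { first };
    }

    // Volatile read of the array field.
    Object[] getArray() {
        return array;
    }

    // Volatile write of the array field.
    void setArray(Object[] a) {
        array = a;
    }

    @SuppressWarnings("unchecked")
    E get(int index) {
        // Readers never take the lock; they rely only on the volatile read.
        return (E) getArray()[index];
    }

    @SuppressWarnings("unchecked")
    E set(int index, E element) {
        lock.lock();
        try {
            Object[] elements = getArray();
            E oldValue = (E) elements[index];
            if (oldValue != element) {
                // Copy, modify the copy, then publish it with a volatile write.
                Object[] copy = elements.clone();
                copy[index] = element;
                setArray(copy);
            } else {
                // Nothing changed, but write the same reference back anyway so
                // that every call to set() performs a volatile write.
                setArray(elements);
            }
            return oldValue;
        } finally {
            lock.unlock();
        }
    }
}
```

In this sketch the else branch does not change what readers see in the array; its only purpose is the volatile write itself.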
Here's an example:
// initial conditions
int nonVolatileField = 0;
CopyOnWriteArrayList<String> list = /* a single String */;

// Thread 1
nonVolatileField = 1;                 // (1)
list.set(0, "x");                     // (2)

// Thread 2
String s = list.get(0);               // (3)
if (s == "x") {
    int localVar = nonVolatileField;  // (4)
}
Let's assume that line (3) gets the value set by line (2), the interned string "x". (For the sake of this example we use identity semantics of interned strings.) Assuming this is true, then the memory model guarantees that the value read at line (4) will be 1 as set by line (1). This is because the volatile write at (2), and every earlier write, happen-before the volatile read at line (3), and every subsequent read.
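For anyone who wants to run this, the following is one possible self-contained version of the example. The class name VisibilityDemo, the method observe, and the initial element "initial" are my own assumptions for illustration. Instead of a single get followed by an if, the reader spins until it actually observes "x", so the precondition "line (3) gets the value set by line (2)" always holds by the time line (4) runs.

```java
import java.util.Collections;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical runnable variant of the two-thread example above. */
final class VisibilityDemo {
    // Deliberately neither volatile nor guarded by a lock.
    static int nonVolatileField = 0;

    /** Returns the value the reader thread saw at step (4). */
    static int observe() {
        nonVolatileField = 0;
        CopyOnWriteArrayList<String> list =
                new CopyOnWriteArrayList<>(Collections.singletonList("initial"));
        int[] seen = new int[1];

        Thread writer = new Thread(() -> {
            nonVolatileField = 1;   // (1) plain write
            list.set(0, "x");       // (2) ends with a volatile write
        });
        Thread reader = new Thread(() -> {
            // (3) each get() does a volatile read; loop until the write at (2) is seen
            while (list.get(0) != "x") {
                Thread.yield();
            }
            seen[0] = nonVolatileField;   // (4) plain read
        });

        reader.start();
        writer.start();
        try {
            writer.join();
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }
}
```

Because the reader only proceeds after seeing the element written at (2), the happens-before chain from (1) to (4) is in place and observe() returns 1. Note that a test like this can only demonstrate the guaranteed case; it cannot show a stale read in the unguaranteed case, since a stale read is permitted but never required.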
Now, suppose that the initial condition were that the list already contained a single element, the interned string "x". And further suppose that the set() method's else clause didn't make the setArray call. Now, depending on the initial contents of the list, the list.set() call at line (2) might or might not perform a volatile write, therefore the read at line (4) might or might not have any visibility guarantees!
Clearly you don't want these memory visibility guarantees to depend upon the current contents of the list. To establish the guarantee in all cases, set() needs to do a volatile write in all cases, and that's why it calls setArray() even if it didn't do any writing itself.
EDIT 2022-07-13
Holger raised an interesting issue in the comments:
If by the time, thread 1 does list.set(0, "x");, the first element is already "x", the scenario we’re talking about, then the thread 2 can not assume that list.get(0) == "x" proved that thread 1 did perform list.set(0, "x");, as the condition is always fulfilled whether thread 2’s read was subsequent to thread 1’s write or not. So if the element doesn’t change, there is no happens-before relationship between (1) and (4) here. The redundant setArray call isn’t enforcing a memory visibility either, as the reader thread could have read the array reference right before that write.
It is true that, looking only at this code in isolation, there is no guarantee that the set at (2) is performed before the get at (3). However, these are operations on a volatile variable, and as such, they are synchronization actions. Under the JMM, synchronization actions have a total order. That is, they will occur in some order, but we don't know which one. The operations could occur in the order (2)->(3) or (3)->(2); there are no other possibilities.
If the order is (3)->(2) then Holger is correct, there is no happens-before relationship, and a subsequent read such as (4) could get a stale value.
However, if the order is (2)->(3) then there is a happens-before relationship, and the read at (4) is guaranteed to see the write at (1).
Isn't this pointless, though, since we can't guarantee the order in which the synchronization actions are performed? And to establish that order, usually we would use some synchronization operations between the threads, which would provide the necessary memory visibility guarantees. Doesn't that make the unconditional volatile write at (2) useless?
Not necessarily. There are mechanisms external to the system, such as timers, network messages, or user interaction, that can clearly establish ordering between certain operations, but that don't establish memory visibility. For example, suppose Thread 1 performs its operation frequently (say, once per second) and Thread 2 performs its operation much less often (say, once per minute). Our application might want Thread 2 to get some recent value, though not necessarily the absolute most recent value. The volatile write performed repeatedly by Thread 1 (and the corresponding volatile read by Thread 2) ensures that Thread 2 sees the 59th or 60th update from Thread 1. If Thread 1 weren't performing any volatile writes, Thread 2 might see an arbitrarily old value.
This is an extremely narrow edge case, but I think it establishes the need for CopyOnWriteArrayList::set to perform its volatile write unconditionally.