When we talk about memory access on modern architectures, we usually ignore the "exact location" the value is read from.
A read operation can fetch data from a cache (L1/L2/...), from RAM, or even from the hard-drive (e.g. when the memory is swapped).
These keywords tell the compiler which assembly instructions to emit when accessing the data.
volatile
A keyword that tells the compiler to always read the variable's value from memory, never from a register.
This "memory" can still be the cache; but if that address in the cache is considered "dirty", meaning the value has been changed by a different processor, the value will be reloaded.
This ensures we never read a stale value.
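To make this concrete, here is a minimal sketch of the classic sanctioned use of volatile: a flag written from a signal handler and polled in a loop. Without volatile the compiler may hoist the load out of the loop and spin forever; with volatile, every iteration performs a fresh load. The handler name and the SIGINT choice are just illustrative.

```cpp
#include <csignal>

volatile std::sig_atomic_t ready = 0;

void on_signal(int) { ready = 1; }   // e.g. installed with std::signal(SIGINT, on_signal)

void wait_for_signal() {
    while (!ready) {
        // busy-wait: each check is a fresh load because `ready` is volatile
    }
}
```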
Clarification: According to the standard, if the volatile type is not a primitive whose read/write operations are atomic by nature (in terms of the assembly instructions that read/write it), a reader might observe an intermediate value (the writer managed to write only half of the bytes by the time the reader read it). However, modern implementations do not behave this way.
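A minimal sketch of that torn-read hazard, using a hypothetical two-field `Pair` type: writing it takes more than one machine store, so a concurrent reader can see one field updated and the other not. volatile prevents register caching, but it does not make the write indivisible.

```cpp
struct Pair { long lo; long hi; };

volatile Pair shared = {0, 0};

void writer() {
    shared.lo = 1;   // store #1
    shared.hi = 1;   // store #2 -- a reader running between the two stores
}                    // observes the intermediate state {1, 0}

long read_lo() { return shared.lo; }
long read_hi() { return shared.hi; }
```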
atomic
When the compiler sees a load (read) operation on an atomic variable, it basically does the exact same thing it would have done for a volatile value.
So, what is the difference???
The difference is cross-CPU write operations.
When working with a volatile variable, if CPU 1 sets the value, and CPU 2 reads it, the reader might read an old value.
But, how can that be? The volatile keyword promises that we won't read a stale value!
Well, that's because the writer never published the value! So even though the reader tries to read it, it still gets the old one.
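Here is a minimal sketch of that situation (names are illustrative): the writer fills `data` and then raises a volatile flag. Nothing orders or publishes the write to `data`, so the reader may observe `flag == 1` and still read a stale `data`.

```cpp
int data = 0;                 // plain data, not synchronized
volatile int flag = 0;        // volatile only affects loads/stores of `flag` itself

void writer() {               // runs on CPU 1
    data = 42;
    flag = 1;                 // no "publish"/release semantics here
}

int reader() {                // runs on CPU 2
    while (!flag) { }         // may eventually see flag == 1 ...
    return data;              // ... and still read the old value of `data`
}
```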
When the compiler stumbles upon a store (write) operation for an atomic variable, it:
- Sets the value atomically in memory
- Announces that the value has changed
After the announcement, all the other CPUs will know that they should re-read the value of the variable, because their cached copies of it will be invalidated.
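The same writer/reader pattern with std::atomic, as a minimal sketch: the atomic store both writes the flag and "announces" it, so once the reader sees the flag set, it is also guaranteed to see the data written before the store (the default memory order is seq_cst, which includes release/acquire semantics).

```cpp
#include <atomic>

int data = 0;
std::atomic<bool> flag{false};

void writer() {               // runs on CPU 1
    data = 42;
    flag.store(true);         // publishes `data` along with the flag
}

int reader() {                // runs on CPU 2
    while (!flag.load()) { }  // spins until the published value is visible
    return data;              // guaranteed to read 42
}
```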
This mechanism is very similar to operations performed on files. When your application writes to a file on the hard-drive, other applications may or may not see the new information, depending on whether or not your application flushed the data to the hard-drive.
If the data wasn't flushed, then it merely resides somewhere in your application's caches and is visible only to your application. Once you flush it, anyone who opens the file will see the new state.
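A small sketch of the analogy (the file name is made up): until flush() is called, the bytes may sit in the stream's buffer and be invisible to other readers of the file; after the flush, a fresh open of the file sees the new contents.

```cpp
#include <fstream>

void publish_to_file() {
    std::ofstream out("state.txt");   // hypothetical file name
    out << "new state";
    out.flush();                      // comparable to the atomic "announce" step
}
```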
Clarification: Common modern compiler & cache implementations ensure correct publishing of volatile writes as well. However, this is NOT a reason to prefer volatile over std::atomic. For example, as some comments pointed out, Linux's atomic reads and writes for x86_64 are implemented using volatile.
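A rough, illustrative sketch of that idea (not the kernel's actual code, which is C): a READ_ONCE/WRITE_ONCE-style helper forces exactly one load or store by accessing the object through a volatile-qualified pointer. Intended here for trivially copyable scalar types.

```cpp
template <typename T>
T read_once(const T& obj) {
    return *static_cast<const volatile T*>(&obj);   // single, non-elided load
}

template <typename T>
void write_once(T& obj, T value) {
    *static_cast<volatile T*>(&obj) = value;        // single, non-elided store
}
```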