I find it much easier to understand atomics with some knowledge of what might be causing them, so here's some background knowledge. Know that these concepts are in no way stated in the C++ language itself, but is some of the possible reasons why things are the way they are.
Compiler reordering
Compilers, often when optimizing, will choose to refactor the program as long as its effects are the same on a single threaded program. This is circumvented with the use of atomics, which will tell the compiler (among other things) that the variable might change at any moment, and that its value might be read elsewhere.
Formally, atomics ensures one thing: there will be no data races. That is, accessing the variable will not make your computer explode.
CPU reordering
CPU might reorder instructions as it is executing them, which means the instructions might get reordered on the hardware level, independent of how you wrote the program.
Caching
Finally there are effects of caches, which are faster memory that sorta contains a partial copy of the global memory. Caches are not always in sync, meaning they don't always agree on what is "correct". Different threads may not be using the same cache, and due to this, they may not agree on what values variables have.
Back to the problem
What the above surmounts to is pretty much what C++ says about the matter: unless explicitly stated otherwise, the ordering of side effects of each instruction is totally and completely unspecified. It might not even be the same viewed from different threads.
Formally, the guarantee of an ordering of side effects is called a happens-before relation. Unless a side effect happens-before another, it is not. Loosely, we just say call it synchronization.
Now, what is memory_order_relaxed
? It is telling the compiler to stop meddling, but don't worry about how the CPU and cache (and possibly other things) behave. Therefore, one possibility of why you see the "impossible" assert might be
- Thread 1 stores
20
to y
and then 10
to x
to its cache.
- Thread 2 reads the new values and stores
10
to y
to its cache.
- Thread 3 didn't read the values from thread 1, but reads those of thread 2, and then the assert fails.
This might be completely different from what happens in reality, the point is anything can happen.
To ensure a happens-before relation between the multiple reads and writes, see Brian's answer.
Another construct that provides happens-before relations is std::mutex
, which is why they are free from such insanities.
memory_order_relaxed
removed the possibility of data race, but in other aspects, it is the same as when they are not atomic. – Loftx
had been loaded with another value before, and the cache line of thread 3 wherex
rests still hasn't been updated since10
was stored inx
, buty
has been updated properly. By asking for a relaxed memory order, you're basically saying you don't care if it's the right value, only that it doesn't race – Urology