If you want to use this in multiple threads, there is one significant gotcha.
While the compiler will not reorder the writes to volatile variables (as described in the answer by Nate Eldredge), there is one more point where write reordering can occur, and that is the CPU itself. This depends on the CPU architecture, and a few examples follow:
Intel 64
See Intel® 64 Architecture Memory Ordering White Paper.
While the store instructions themselves are not reordered (2.2):
- Stores are not reordered with other stores.
They may be visible to different CPUs in a different order (2.4):
Intel 64 memory ordering allows stores by two processors to be seen in different orders by those two processors
AMD 64
AMD64 (the common x64 architecture) specifies similar behaviour:
Generally, out-of-order writes are not allowed. Write instructions executed out of order cannot commit (write) their result to memory until all previous instructions have completed in program order. The processor can, however, hold the result of an out-of-order write instruction in a private buffer (not visible to software) until that result can be committed to memory.
PowerPC
I remember having to be careful about this on Xbox 360 which used a PowerPC CPU:
While the Xbox 360 CPU does not reorder instructions, it does rearrange write operations, which complete after the instructions themselves. This rearranging of writes is specifically allowed by the PowerPC memory model
To avoid CPU reordering in a portable way you need to use memory fences like C++11 std::atomic_thread_fence or C11 atomic_thread_fence. Without them, the order of writes as seen from another thread may be different.
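As a minimal sketch of the fence approach, assuming a simple producer/consumer handoff: the names `payload`, `ready`, `producer`, `consumer` and `run_fence_demo` are hypothetical and not from the question. The flag is a `std::atomic<bool>` accessed with relaxed operations, and the two `std::atomic_thread_fence` calls supply the ordering, so the write to `payload` is guaranteed to be visible once the consumer has seen the flag.

```cpp
#include <atomic>
#include <thread>

int payload = 0;                  // plain data, published through the flag
std::atomic<bool> ready{false};   // flag, accessed with relaxed operations only

void producer() {
    payload = 42;                                         // 1. write the data
    std::atomic_thread_fence(std::memory_order_release);  // 2. keep the write above from moving below the flag store
    ready.store(true, std::memory_order_relaxed);         // 3. raise the flag
}

int consumer() {
    while (!ready.load(std::memory_order_relaxed)) {      // spin until the flag is seen
        std::this_thread::yield();
    }
    std::atomic_thread_fence(std::memory_order_acquire);  // keep the read below from moving above the flag load
    return payload;                                       // synchronized with the producer's write
}

// Runs one producer and one consumer and returns what the consumer read.
int run_fence_demo() {
    int seen = 0;
    std::thread t1(producer);
    std::thread t2([&seen] { seen = consumer(); });
    t1.join();
    t2.join();
    return seen;
}
```

Note that the flag itself is atomic here: the fences order the surrounding accesses, but an access to a plain or volatile flag from two threads would still be a data race as far as the C++ standard is concerned.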
See also: "C++11 introduced a standardized memory model. What does it mean? And how is it going to affect C++ programming?"
This is also noted in the Wikipedia Memory barrier article:
Moreover, it is not guaranteed that volatile reads and writes will be seen in the same order by other processors or cores due to caching, cache coherence protocol and relaxed memory ordering, meaning volatile variables alone may not even work as inter-thread flags or mutexes.
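For an inter-thread flag, one common alternative to a volatile variable is a `std::atomic<bool>` with release/acquire ordering. The following is only a sketch under that assumption; `Message`, `publish`, `wait_and_read` and `run_atomic_flag_demo` are made-up names for illustration.

```cpp
#include <atomic>
#include <thread>

struct Message {
    int value = 0;                       // plain data guarded by the flag
    std::atomic<bool> published{false};  // inter-thread flag
};

void publish(Message& m, int v) {
    m.value = v;
    // Release store: the write to m.value cannot be reordered after it.
    m.published.store(true, std::memory_order_release);
}

int wait_and_read(const Message& m) {
    // Acquire load: once it observes true, the write to m.value is visible.
    while (!m.published.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    return m.value;
}

// Starts a reader and a writer thread and returns the value the reader saw.
int run_atomic_flag_demo() {
    Message m;
    int seen = 0;
    std::thread reader([&] { seen = wait_and_read(m); });
    std::thread writer([&] { publish(m, 7); });
    writer.join();
    reader.join();
    return seen;
}
```

Compared with explicit fences, the ordering is attached to the flag operations themselves, which is usually easier to read and harder to get wrong.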
[...] std::atomic. It has similar non-reordering guarantees. – Baeda
volatile std::atomic types have some counterintuitive behavior, and at least on current compilers. For instance here a load from a volatile std::atomic<int> is optimized out because its value is unused, even though it wouldn't be for a regular volatile int. – Commando
[...] std::atomic with volatile. If op exposes that structure for IO interaction then utilizing volatile is unquestionable. However op's tag suggests it's about concurrency (multithreaded program) in which case std::atomic is the right tool to use and not volatile. Perhaps this is just a loose style of tag naming. – Baeda
[...] volatile std::atomic in the first place anyways. – Claar