Generally, std::atomic<T> does not imply the semantics of volatile, i.e. operations on the atomic object are not observable side effects that the compiler needs to preserve.
As a consequence the compiler can optimize e.g.
void f(std::atomic<int>& x) {
    x.fetch_add(0, std::memory_order_relaxed);
}
to a no-op. And indeed Clang does perform this optimization, while GCC and MSVC do not.
I think it is expected that std::atomic and volatile semantics can be combined by qualifying std::atomic<T> with volatile, i.e. my expectation is that, assuming std::atomic<int> is always lock-free,
void f(volatile std::atomic<int>& x) {
    x.fetch_add(0, std::memory_order_relaxed);
}
forces the compiler to emit an instruction in f that implements a read-modify-write operation on x.
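For side-by-side inspection, the two variants can be placed in one translation unit and pasted into a compiler explorer. This is only a minimal sketch: the names f_plain, f_volatile and value_unchanged_by_both are illustrative and not from the original snippets. As far as the abstract machine is concerned, both calls leave the stored value untouched, so a runtime check cannot tell them apart; any difference only shows up in the generated assembly.

```cpp
#include <atomic>
#include <cassert>  // only needed for the assert-based usage mentioned below

// Non-volatile variant: the compiler may treat the call as a no-op.
void f_plain(std::atomic<int>& x) {
    x.fetch_add(0, std::memory_order_relaxed);
}

// volatile variant: overload resolution selects the volatile-qualified fetch_add.
void f_volatile(volatile std::atomic<int>& x) {
    x.fetch_add(0, std::memory_order_relaxed);
}

// Hypothetical helper: true if neither call changed the stored value.
bool value_unchanged_by_both(int initial) {
    std::atomic<int> a{initial};
    volatile std::atomic<int> b{initial};
    f_plain(a);
    f_volatile(b);
    return a.load() == initial && b.load() == initial;
}
```

Calling assert(value_unchanged_by_both(5)) passes with either code generation strategy, which is exactly why the question can only be settled by the wording of the standard and not by testing.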
But interestingly Clang only behaves partially in the way I would expect it to. x86-64 Clang 18.1 -O2 emits:
f(std::atomic<int> volatile&):
        mfence
        mov     eax, dword ptr [rdi]
        ret
There is a load instruction, but no store to the memory. In particular this is not an atomic RMW. While this may simply be a bug, I do wonder what guarantees one should be able to expect on volatile std::atomic access from the standard, especially for RMW operations.
std::atomic specifically has volatile-qualified member functions, showing that volatile-qualified std::atomics are intended for use. However, I have trouble finding anything specifying the semantics of the volatile overloads.
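The existence of these overloads can be checked at compile time. The following is a sketch assuming C++17 or later; the alias VA and the constant names are made up for illustration. Each decltype expression is only well-formed if the member function can be called on a volatile-qualified std::atomic<int>, so a missing overload would be a compile error rather than a false value. It says nothing about what the overloads are required to do.

```cpp
#include <atomic>
#include <cassert>
#include <type_traits>
#include <utility>

using VA = volatile std::atomic<int>;  // illustrative alias

// Well-formed only because load() has a volatile-qualified overload.
constexpr bool volatile_load_returns_int =
    std::is_same_v<decltype(std::declval<VA&>().load(std::memory_order_relaxed)), int>;

// Same check for store(), which returns void.
constexpr bool volatile_store_returns_void =
    std::is_void_v<decltype(std::declval<VA&>().store(0, std::memory_order_relaxed))>;

// Same check for the RMW operation used in the question.
constexpr bool volatile_fetch_add_returns_int =
    std::is_same_v<decltype(std::declval<VA&>().fetch_add(0, std::memory_order_relaxed)), int>;

// The assumption made in the question; platform-dependent.
constexpr bool int_is_always_lock_free = std::atomic<int>::is_always_lock_free;
```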
In particular, only accesses to objects through volatile glvalues are observable side effects. Calling a member function on std::atomic<T> is, however, not an access, and it isn't clear to me whether/how the volatile qualification on the member function is supposed to translate to the atomic operation semantics specified for each function, because these are not specified in terms of reads/writes on glvalues.
Even if I assume that an atomic load corresponds to a read of a scalar value and an atomic store corresponds to a write of a scalar value, and that in all cases of volatile-qualified member functions the operation is supposed to be analogous to the corresponding access on a volatile-qualified glvalue, then there is still a problem with RMW operations, because an "access" can only be a read or a write.
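To make the mismatch concrete, here is a sketch that puts the scalar accesses next to the atomic member calls they would hypothetically correspond to. All function names are illustrative. The comments on the scalar functions describe accesses through a volatile glvalue; the comments on the atomic functions deliberately do not claim any such mapping, since that is the open question.

```cpp
#include <atomic>
#include <cassert>

// Scalar case: every function performs accesses through a volatile glvalue.
int read_scalar(volatile int& v) { return v; }         // one read
void write_scalar(volatile int& v, int n) { v = n; }   // one write
void add_zero_scalar(volatile int& v) { v = v + 0; }   // a read, then a write; not atomic

// Atomic case: member function calls on a volatile-qualified object.
// Whether these count as volatile accesses is what the question asks.
int read_atomic(volatile std::atomic<int>& a) {
    return a.load(std::memory_order_relaxed);
}
void write_atomic(volatile std::atomic<int>& a, int n) {
    a.store(n, std::memory_order_relaxed);
}
int add_zero_atomic(volatile std::atomic<int>& a) {
    // A single indivisible RMW; there is no scalar "access" that is both a read and a write.
    return a.fetch_add(0, std::memory_order_relaxed);
}
```

The scalar add_zero_scalar decomposes into two separate accesses, whereas fetch_add is one indivisible operation, so neither "one read" nor "one write" describes it.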
Does volatile-qualification of std::atomic have the effect I expect for simple loads and stores? How should it behave for RMW operations and is Clang's behavior correct?
I posted an analogous question regarding C here.
[...] 0 it will allow for optimizations to be applied, since adding 0 will not do anything (that might explain why you don't see any store). It is at least interesting to see the differences in the code being generated when you add either 1 or 2. – Protectionist
[...] 0 specifically because it is guaranteed to have no effect other than the volatile access. In principle I could have used any other set of optimizable atomic operations, e.g. a store followed by a redundant store with the same value, but my example is one of the few cases where an actual compiler does an optimization on atomic access. In more complex scenarios compilers behave de facto as if volatile semantics were always present on atomic access, even if the standard doesn't mandate that. – Stopple
[...] atomic objects failing to check for volatile. Ops on volatile objects shouldn't be optimized away; that's the expectation at least. If that was an MMIO register, not touching it would be a bug, and using volatile atomic is how I'd expect to be able to describe it to the compiler. – Premiership