Semantics of volatile std::atomic<T>

Generally, std::atomic<T> does not imply volatile semantics, i.e. operations on the atomic object are not observable side effects that the compiler needs to preserve.

As a consequence the compiler can optimize e.g.

void f(std::atomic<int>& x) {
    x.fetch_add(0, std::memory_order_relaxed);
}

to a no-op. And indeed Clang does perform this optimization, while GCC and MSVC do not.
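A minimal sketch of why the call can be dropped for a non-volatile atomic: the operation returns the current value and leaves the stored value unchanged, so a caller that ignores the result cannot tell whether it ran. The helper name fetch_add_zero_preserves_value is hypothetical and exists only for illustration.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper: performs the RMW from the question on a local atomic
// and reports whether the stored value is unchanged afterwards.
bool fetch_add_zero_preserves_value(int initial) {
    std::atomic<int> x{initial};
    // fetch_add returns the value held immediately before the addition.
    int previous = x.fetch_add(0, std::memory_order_relaxed);
    return previous == initial &&
           x.load(std::memory_order_relaxed) == initial;
}
```

This only shows that the value is preserved; it says nothing about which instructions a particular compiler emits.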

I think it is expected that std::atomic and volatile semantics can be combined by qualifying std::atomic<T> with volatile, i.e. my expectation is that, assuming std::atomic<int> is always lock-free,

void f(volatile std::atomic<int>& x) {
    x.fetch_add(0, std::memory_order_relaxed);
}

forces the compiler to emit an instruction in f that implements a read-modify-write operation on x.
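The lock-free assumption can be stated in code rather than in prose. The following is a sketch, assuming C++17 or later for is_always_lock_free; f_observed is a hypothetical variant of f that also returns the value it read, so that the result can be checked.

```cpp
#include <atomic>
#include <cassert>

// Makes the "always lock-free" assumption explicit (C++17 and later).
static_assert(std::atomic<int>::is_always_lock_free,
              "this sketch assumes a lock-free std::atomic<int>");

// Hypothetical variant of f: same operation through a volatile reference,
// but the previously stored value is returned to the caller.
int f_observed(volatile std::atomic<int>& x) {
    return x.fetch_add(0, std::memory_order_relaxed);
}
```

The static_assert fails to compile on a target where the assumption does not hold, instead of silently changing the meaning of the question.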

But interestingly Clang only behaves partially in the way I would expect it to. x86-64 Clang 18.1 -O2 emits:

f(std::atomic<int> volatile&):
        mfence
        mov     eax, dword ptr [rdi]
        ret

There is a load instruction, but no store to memory. In particular, this is not an atomic RMW. While this may simply be a bug, I do wonder what guarantees one should be able to expect from the standard for volatile std::atomic access, especially for RMW operations.


std::atomic specifically has volatile-qualified member functions, showing that volatile-qualified std::atomics are intended for use. However, I have trouble finding anything specifying the semantics of the volatile overloads.
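The existence of the volatile-qualified overloads can be checked at compile time. This is a sketch using a detection idiom; has_volatile_load and NonVolatileOnly are hypothetical names introduced for the comparison, not part of the standard library.

```cpp
#include <atomic>
#include <cassert>
#include <type_traits>
#include <utility>

// Detects whether load() can be called on a volatile-qualified T.
template <class T, class = void>
struct has_volatile_load : std::false_type {};

template <class T>
struct has_volatile_load<
    T, std::void_t<decltype(std::declval<volatile T&>().load())>>
    : std::true_type {};

// Hypothetical type whose load() is const-qualified but not volatile-qualified.
struct NonVolatileOnly {
    int load() const { return 0; }
};

static_assert(has_volatile_load<std::atomic<int>>::value,
              "std::atomic<int> provides a volatile-qualified load()");
static_assert(!has_volatile_load<NonVolatileOnly>::value,
              "a const-only member cannot be called on a volatile object");
```

This confirms only that the overloads are callable; it does not settle what semantics they carry, which is the point of the question.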

In particular, only accesses to objects through volatile glvalues are observable side effects. Calling a member function on std::atomic<T> is, however, not an access, and it isn't clear to me whether or how the volatile qualification on the member function is supposed to translate to the atomic operation semantics specified for each function, because these are not specified in terms of reads and writes on glvalues.

Even if I assume that an atomic load corresponds to a read of a scalar value, that an atomic store corresponds to a write of a scalar value, and that in all cases of volatile-qualified member functions the operation is supposed to be analogous to the corresponding access on a volatile-qualified glvalue, there is still a problem with RMW operations, because an "access" can only be a read or a write.
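To make the gap concrete, compare a plain volatile int with a volatile std::atomic<int>. This is a sketch; touch_plain and touch_atomic are hypothetical names. For the plain object, "read, then write back" is expressed as two separate volatile accesses. For the atomic, the same intent is a single member function call, and whether that call has to be treated like a volatile read plus a volatile write is exactly what remains unclear.

```cpp
#include <atomic>
#include <cassert>

// Plain volatile object: one volatile read followed by one volatile write.
// These are two distinct accesses and together they are not atomic.
int touch_plain(volatile int& x) {
    int v = x;  // volatile read
    x = v;      // volatile write of the same value
    return v;
}

// Volatile atomic: a single RMW operation invoked through the
// volatile-qualified overload of fetch_add.
int touch_atomic(volatile std::atomic<int>& x) {
    return x.fetch_add(0, std::memory_order_relaxed);
}
```

Both functions leave the stored value unchanged; they differ only in how the operation is described to the compiler.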


Does volatile-qualification of std::atomic have the effect I expect for simple loads and stores? How should it behave for RMW operations and is Clang's behavior correct?


I posted an analogous question regarding C here.

Stopple answered 22/7, 2024 at 2:16

Comments (4)
Maybe you should not add 0: it allows optimizations to be applied, since adding 0 does not change anything (that might explain why you don't see any store). It is at least interesting to see the differences in the generated code when you add 1 or 2 instead. - Protectionist
@PepijnKramer I used 0 specifically because it is guaranteed to have no effect other than the volatile access. In principle I could have used any other set of optimizable atomic operations, e.g. a store followed by a redundant store with the same value, but my example is one of the few cases where an actual compiler does an optimization on atomic access. In more complex scenarios compilers behave de facto as if volatile semantics were always present on atomic access, even if the standard doesn't mandate that. - Stopple
Seems like a clang bug to me, at least in terms of quality-of-implementation even if C++ technically allows this, perhaps with its optimizations for atomic objects failing to check for volatile. Ops on volatile objects shouldn't be optimized away; that's the expectation at least. If that was an MMIO register, not touching it would be a bug, and using volatile atomic is how I'd expect to be able to describe it to the compiler. - Premiership
One of the original uses of volatile was to allow device drivers to be written. For devices, a memory location is not necessarily "memory". Reading a value and writing the same value back might perform a necessary hardware function. Volatile was supposed to prevent optimizations like this. - Leftwards
