A codebase has a COMPILER_BARRIER macro defined as __asm__ volatile("" ::: "memory"). The intent of the macro is to prevent the compiler from re-ordering reads and writes across the barrier. Note that this is explicitly a compiler barrier, and not a processor level memory barrier.
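For concreteness, here is a minimal sketch of how such a macro might be defined and used. It assumes a compiler that accepts GCC's Extended Asm syntax (GCC or Clang). Whether the real macro is function-like, and the names `payload`, `ready`, `publish` and `check_barrier_demo`, are assumptions made only for illustration.

```cpp
#include <cassert>

// Compiler-only barrier: the asm template is empty, so no instruction is
// emitted. The "memory" clobber tells the compiler that memory may have
// changed, so it must not move loads or stores across this point.
// (Assumed function-like form; the original macro may be spelled differently.)
#define COMPILER_BARRIER() __asm__ volatile("" ::: "memory")

// Hypothetical globals used only for this illustration.
int payload = 0;
int ready = 0;

void publish(int value) {
    payload = value;     // store that must stay before the barrier
    COMPILER_BARRIER();  // the compiler may not sink or hoist across this
    ready = 1;           // store that must stay after the barrier
}

// Single-threaded sanity check: the barrier does not change the values
// written, it only constrains the order in which the compiler emits them.
void check_barrier_demo() {
    publish(42);
    assert(payload == 42);
    assert(ready == 1);
}
```

Nothing here constrains the processor: on a weakly ordered CPU the two stores could still become visible to another core in either order.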
As is, this is fairly portable since there are no actual assembly instructions in the AssemblerTemplate, just the volatile and the memory clobber. So, as long as the compiler honors GCC's Extended Asm syntax, it should work fine. Still, I'm curious what the right way to express this would be in the C++11 atomics API, if possible.
The following seemed like it might be the right idea: atomic_signal_fence(memory_order_acq_rel);
My reasoning being that:
- Of the <atomic> APIs, only atomic_signal_fence and atomic_thread_fence do not need a memory address against which to operate.
- atomic_thread_fence affects memory ordering, which we don't need for a compiler barrier.
- The memory clobber in the Extended Asm version doesn't distinguish between reads and writes, so it would appear that we want both acquire and release semantics, so memory_order_acq_rel seems to be required, at minimum.
- memory_order_seq_cst seems unnecessary, as we don't require a total order across threads - we are only interested in the instruction sequencing within the current thread.
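The proposal and the reasoning above could be sketched as the wrapper below. The names `compiler_barrier`, `data`, `flag`, `publish_with_fence` and `check_fence_demo` are hypothetical. The example uses std::atomic objects with relaxed operations, since the standard describes fences in terms of atomic operations. Whether the same fence constrains plain, non-atomic variables the way the asm version does is the open question here, so treat this as an illustration of the idea rather than a verified equivalent.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical wrapper for the proposed replacement.
inline void compiler_barrier() noexcept {
    // acq_rel requests both acquire and release fence semantics, mirroring
    // the reasoning that the "memory" clobber treats reads and writes alike.
    std::atomic_signal_fence(std::memory_order_acq_rel);
}

// Hypothetical atomic objects used only for this illustration.
std::atomic<int> data{0};
std::atomic<int> flag{0};

void publish_with_fence(int value) {
    data.store(value, std::memory_order_relaxed);
    compiler_barrier();  // signal fence: ordering within this thread only
    flag.store(1, std::memory_order_relaxed);
}

// Single-threaded sanity check that the wrapper compiles and the stores land.
void check_fence_demo() {
    publish_with_fence(7);
    assert(data.load(std::memory_order_relaxed) == 7);
    assert(flag.load(std::memory_order_relaxed) == 1);
}
```

Swapping in std::atomic_thread_fence would additionally request inter-thread ordering, which typically costs a hardware fence instruction on weakly ordered processors and is more than a compiler barrier needs.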
Is it possible to express the equivalent to __asm__ volatile("" ::: "memory") entirely portably with the C++11 atomics API? If so, is atomic_signal_fence the correct API to use? If so, what memory order argument is appropriate/required here?
Or, am I off in the weeds here and there is a better way to approach this?
atomic_signal_fence only guarantees ordering between a thread and a signal handler running in that same thread. Similarly, atomic_thread_fence only applies to the ordering between threads. If you're trying to guarantee ordering between two other contexts then neither is portable. For example, on Windows atomic_signal_fence doesn't need to do anything because Windows doesn't support asynchronous signals. – Dyer
atomic_thread_fence is defined in terms of atomic operations on atomic objects, as defined by the standard. So if you're not using the std::atomic types then neither function is guaranteed to work. – Dyer
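To illustrate the case the comments describe as guaranteed, here is a minimal sketch of atomic_signal_fence ordering a thread against a signal handler that runs in that same thread, using only std::atomic objects. All names (`message`, `message_ready`, `seen_by_handler`, `on_signal`, `run_signal_demo`) are hypothetical, and std::raise is used so that the handler runs synchronously, which keeps the example deterministic.

```cpp
#include <atomic>
#include <cassert>
#include <csignal>

// A signal handler may only touch lock-free atomics (or volatile sig_atomic_t).
static_assert(ATOMIC_INT_LOCK_FREE == 2, "std::atomic<int> must be lock-free");
static_assert(ATOMIC_BOOL_LOCK_FREE == 2, "std::atomic<bool> must be lock-free");

// Hypothetical shared state between the thread and its signal handler.
std::atomic<int> message{0};
std::atomic<bool> message_ready{false};
std::atomic<int> seen_by_handler{-1};

extern "C" void on_signal(int) {
    if (message_ready.load(std::memory_order_relaxed)) {
        // Pairs with the release fence in run_signal_demo().
        std::atomic_signal_fence(std::memory_order_acquire);
        seen_by_handler.store(message.load(std::memory_order_relaxed),
                              std::memory_order_relaxed);
    }
}

void run_signal_demo() {
    std::signal(SIGINT, on_signal);
    message.store(42, std::memory_order_relaxed);
    // Keeps the store to message ordered before the store to message_ready
    // as seen by a handler interrupting this thread.
    std::atomic_signal_fence(std::memory_order_release);
    message_ready.store(true, std::memory_order_relaxed);
    std::raise(SIGINT);  // the handler runs in this thread before raise returns
    assert(seen_by_handler.load(std::memory_order_relaxed) == 42);
}
```

Ordering against another thread, an interrupt routine, or a hardware device falls outside what these fences promise, which is the portability concern raised above.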