Even for a simple two-thread communication example, I have difficulty expressing it in the C11 atomic and memory-fence style so as to obtain proper memory ordering:
shared data:
volatile int flag, bucket;
producer thread:
while (true) {
    int value = producer_work();
    while (atomic_load_explicit(&flag, memory_order_acquire))
        ; // busy wait
    bucket = value;
    atomic_store_explicit(&flag, 1, memory_order_release);
}
consumer thread:
while (true) {
    while (!atomic_load_explicit(&flag, memory_order_acquire))
        ; // busy wait
    int data = bucket;
    atomic_thread_fence(/* memory_order ??? */);
    atomic_store_explicit(&flag, 0, memory_order_release);
    consumer_work(data);
}
As far as I understand, the above code properly orders store-to-bucket -> flag-store -> flag-load -> load-from-bucket. However, I think a race condition remains between the load from bucket and the producer overwriting bucket with new data. To force an ordering after the bucket read, I guess I would need an explicit atomic_thread_fence() between the bucket read and the following atomic_store. Unfortunately, there seems to be no memory_order argument that enforces anything on preceding loads, not even memory_order_seq_cst.
A really dirty solution would be to re-assign bucket in the consumer thread with a dummy value, but that contradicts the notion of a read-only consumer.
In the older C99/GCC world I could use the traditional __sync_synchronize(), which I believe would be strong enough.
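For comparison, here is a minimal sketch of what the fence variant might look like in C11, assuming that atomic_thread_fence(memory_order_seq_cst) is the closest counterpart to __sync_synchronize() and that flag is declared as atomic_int rather than volatile int. The function name consume_once and the initial values are hypothetical and only make the snippet self-contained; this illustrates the fence placement and is not a claim that the fence is required.

```c
#include <stdatomic.h>

/* Assumption: flag is a C11 atomic; the question declares it volatile int. */
static atomic_int flag = 1;  /* 1 = bucket full (pretend a producer filled it) */
static int bucket = 42;      /* plain data handed over via the flag */

/* Hypothetical helper: one iteration of the consumer loop, fence variant. */
int consume_once(void) {
    while (!atomic_load_explicit(&flag, memory_order_acquire))
        ; /* busy wait until the bucket is full */

    int data = bucket;

    /* Full fence: the bucket read above cannot move below it,
       and the flag store below cannot move above it. */
    atomic_thread_fence(memory_order_seq_cst);

    /* With the fence in place, a relaxed store is enough to publish "empty". */
    atomic_store_explicit(&flag, 0, memory_order_relaxed);
    return data;
}
```

If the flag store already uses memory_order_release, the fence should be redundant, because a release store orders all preceding reads and writes of the same thread before it.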
What would be the nicer C11-style solution to synchronize this so-called anti-dependency?
(Of course I am aware that I should avoid such low-level coding and use the available higher-level constructs, but I would like to understand...)
I don't think the atomic_thread_fence() call is necessary. The flag update has release semantics, preventing any preceding store instructions from being reordered across it (e.g., the store to data). The store to data has a dependency on the read from bucket, so that read cannot be reordered past the flag release either. If the full fence is necessary, I'd love to hear why. – Morion
... atomic_flag data type, which implements exactly these semantics but possibly has a more direct implementation in hardware. atomic_flag is the only atomic data type that is guaranteed to be lock-free, so it is always to be preferred over more complex operations. And it definitely wouldn't need an extra fence to ensure consistency. – Mortality
... bucket could not be reordered past the write to flag, regardless of where the read value goes. Also, from Herb Sutter: "A write-release executes after all reads and writes by the same thread that precede it in program order." – Morion
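The argument that release/acquire on flag is sufficient can be tried out with a small, bounded version of the two loops. This is only a sketch under stated assumptions: flag is declared as atomic_int (the C11 atomic generic functions are specified for atomic types, not for volatile int), POSIX threads are used, and the names N_ITEMS and run_handoff as well as the sched_yield() calls in the wait loops are additions for illustration. The consumer's read of bucket is sequenced before its release store of 0; the producer only writes bucket after an acquire load has observed that 0, so the read happens-before the next write without any extra fence.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

enum { N_ITEMS = 100 };   /* hypothetical bound so the sketch terminates */

static atomic_int flag;   /* 0 = bucket empty, 1 = bucket full */
static int bucket;        /* plain (non-atomic) data guarded by flag */

static void *producer(void *arg) {
    (void)arg;
    for (int value = 0; value < N_ITEMS; ++value) {
        /* Acquire: pairs with the consumer's release store of 0. */
        while (atomic_load_explicit(&flag, memory_order_acquire))
            sched_yield(); /* polite busy wait */
        bucket = value;
        atomic_store_explicit(&flag, 1, memory_order_release);
    }
    return NULL;
}

static void *consumer(void *arg) {
    long *sum = arg;
    for (int i = 0; i < N_ITEMS; ++i) {
        /* Acquire: pairs with the producer's release store of 1. */
        while (!atomic_load_explicit(&flag, memory_order_acquire))
            sched_yield(); /* polite busy wait */
        int data = bucket;
        /* Release: the read of bucket above is ordered before this store,
           so no separate atomic_thread_fence() is used here. */
        atomic_store_explicit(&flag, 0, memory_order_release);
        *sum += data;
    }
    return NULL;
}

/* Runs one producer and one consumer; returns the sum of consumed values. */
long run_handoff(void) {
    pthread_t p, c;
    long sum = 0;
    int rc;

    atomic_store(&flag, 0);
    rc = pthread_create(&p, NULL, producer, NULL);
    assert(rc == 0);
    rc = pthread_create(&c, NULL, consumer, &sum);
    assert(rc == 0);
    (void)rc;
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    return sum;
}
```

If every value 0..N_ITEMS-1 is handed over exactly once, run_handoff() returns 4950. A passing run does not prove the ordering is correct on every platform; a data-race detector such as ThreadSanitizer is a more meaningful check.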