How are memory_order_seq_cst fences useful anymore in C++20?

    #include <atomic>
    #include <cassert>
    #include <thread>

    std::atomic<int> x{ 0 };
    std::atomic<int> y{ 0 };
    int a;
    int b;

    void thread1()
    {
        //atomic op A
        x.store(1, std::memory_order_relaxed);
        //fence X
        std::atomic_thread_fence(std::memory_order_seq_cst);
        //sequenced-before P, thus in SC order X=>P
        //atomic op P
        a = y.load(std::memory_order_seq_cst);//0
        //reads-before(from-read) Q, thus in SC order P=>Q
    }

    void thread2()
    {
        //atomic op Q
        y.store(1, std::memory_order_seq_cst);
        //sequenced-before B, thus in SC order Q=>B
        //atomic op B
        b = x.load(std::memory_order_seq_cst);
    }

    int main()
    {
        std::thread t2(thread2);
        std::thread t1(thread1);
        t1.join();
        t2.join();
        assert(a == 1 || b == 1);//true?
        return 0;
    }

Yes, I think we can prove that a == 1 || b == 1 is always true. Most of the ideas here were worked out in comments by zwhconst and Peter Cordes, so I just thought I would write it up as an exercise.

(Note that X, Y, A, B below are used as the dummy variables in the standard's axioms, and may change from line to line. They do not coincide with the labels in your code.)

Suppose b = x.load() in thread2 yields 0.

We do have the coherence ordering that you asked about. Specifically, if b = x.load() yields 0, then I claim that x.load() in thread2 is coherence-ordered before x.store(1) in thread1, thanks to the third bullet in the definition of coherence ordering. To see this, let A be x.load(), B be x.store(1), and X be the initialization x{0} (see below for a quibble). Clearly X precedes B in the modification order of x, since X happens-before B (synchronization occurs when the thread is started), and if b == 0 then A has read the value stored by X.

(There is possibly a gap here: initialization of an atomic object is not an atomic operation (31.8.1p3, [atomics.types.operations]), so as worded, the coherence ordering does not apply to it. I have to believe it was intended to apply here, though. Anyway, we could dodge the issue by putting x.store(0, std::memory_order_relaxed); in main before starting the threads, which would still address the spirit of your question.)
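A minimal sketch of that workaround, in case it helps to see it spelled out. The names `Result` and `run_once` are hypothetical helpers and not part of the question's code, and I have assumed you would want the same explicit store for y as well, since the argument for y below has the same gap. The explicit relaxed stores are atomic modifications, and they happen-before everything in both threads because thread creation synchronizes with the start of the thread function.

```cpp
#include <atomic>
#include <thread>

// Hypothetical helper type (not in the original question).
struct Result { int a; int b; };

// Hypothetical helper: one run of the test, with the initial values
// written by explicit atomic stores instead of relying on initialization.
Result run_once() {
    std::atomic<int> x{0};
    std::atomic<int> y{0};
    int a = -1;
    int b = -1;

    // Atomic modifications that precede x.store(1) and y.store(1) in the
    // respective modification orders. A load that returns 0 now reads
    // from one of these, so the coherence-ordering wording applies.
    x.store(0, std::memory_order_relaxed);
    y.store(0, std::memory_order_relaxed);  // assumption: same dodge for y

    std::thread t2([&] {
        y.store(1, std::memory_order_seq_cst);
        b = x.load(std::memory_order_seq_cst);
    });
    std::thread t1([&] {
        x.store(1, std::memory_order_relaxed);
        std::atomic_thread_fence(std::memory_order_seq_cst);
        a = y.load(std::memory_order_seq_cst);
    });

    t1.join();
    t2.join();
    return {a, b};  // safe to read: join() synchronizes with thread completion
}
```

Calling `run_once()` from main and asserting `r.a == 1 || r.b == 1` on the result corresponds to the assertion in your original program.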

Now in the definition of the ordering S, apply the second bullet with A = x.load() and B = x.store(1) as before, and Y being the atomic_thread_fence in thread1. Then A is coherence-ordered before B, as we just showed; A is seq_cst; and B happens-before Y by sequencing. Therefore A = x.load() precedes Y = fence in the order S.

Now suppose a = y.load() in thread1 also yields 0.

By a similar argument to before, y.load() is coherence ordered before y.store(1), and they are both seq_cst, so y.load() precedes y.store(1) in S. Also, y.store(1) precedes x.load() in S by sequencing, and likewise atomic_thread_fence precedes y.load() in S. We therefore have in S:

x.load precedes fence precedes y.load precedes y.store precedes x.load

which is a cycle, contradicting the requirement that S be a strict total order. So the two loads cannot both yield 0, and a == 1 || b == 1 always holds.
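If you want to poke at this empirically, something like the following sketch tallies the outcomes over many runs. The function name `tally_outcomes` and the indexing scheme are my own assumptions for illustration. Keep in mind that a run which never observes a == 0 && b == 0 proves nothing by itself; it can only fail to find a counterexample, and the argument above is what actually rules that outcome out.

```cpp
#include <array>
#include <atomic>
#include <cstddef>
#include <thread>

// Hypothetical harness (not from the original post). Runs the test
// `iterations` times and counts each (a, b) outcome at index 2*a + b,
// so index 0 is the outcome a == 0 && b == 0.
std::array<long, 4> tally_outcomes(int iterations) {
    std::array<long, 4> counts{};
    for (int i = 0; i < iterations; ++i) {
        std::atomic<int> x{0};
        std::atomic<int> y{0};
        int a = 0;
        int b = 0;

        std::thread t2([&] {
            y.store(1, std::memory_order_seq_cst);
            b = x.load(std::memory_order_seq_cst);
        });
        std::thread t1([&] {
            x.store(1, std::memory_order_relaxed);
            std::atomic_thread_fence(std::memory_order_seq_cst);
            a = y.load(std::memory_order_seq_cst);
        });

        t1.join();
        t2.join();
        // Each load returns 0 or 1, so the index is always in [0, 3].
        ++counts[static_cast<std::size_t>(2 * a + b)];
    }
    return counts;
}
```

Starting fresh threads on every iteration makes the two critical sections overlap only rarely, so this is a weak stress test. A dedicated litmus-testing tool would exercise the interesting interleavings far more often.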

1. Fence X precedes atomic op B in the SC total order

2. Atomic op B reads the value (1) written by atomic op A
