Will two relaxed writes to the same location in different threads always be seen in the same order by other threads?

On the x86 architecture, stores to the same memory location have a total order, e.g., see this video. What are the guarantees in the C++11 memory model?

More precisely, in

-- Initially --
std::atomic<int> x{0};

-- Thread 1 --
x.store(1, std::memory_order_release);

-- Thread 2 --
x.store(2, std::memory_order_release);

-- Thread 3 --
int r1 = x.load(std::memory_order_acquire);
int r2 = x.load(std::memory_order_acquire);

-- Thread 4 --
int r3 = x.load(std::memory_order_acquire);
int r4 = x.load(std::memory_order_acquire);

would the outcome r1==1, r2==2, r3==2, r4==1 be allowed (on some architecture other than x86)? What if I were to replace every memory_order argument with std::memory_order_relaxed?

Educatory answered 6/12, 2014 at 15:43 Comment(1)
Related followup with independent writes to two different locations: Will two atomic writes to different locations in different threads always be seen in the same order by other threads? - Utter

No, such an outcome is not allowed. §1.10 [intro.multithread]/p8, 18 (quoting N3936/C++14; the same text is found in paragraphs 6 and 16 for N3337/C++11):

8 All modifications to a particular atomic object M occur in some particular total order, called the modification order of M.

18 If a value computation A of an atomic object M happens before a value computation B of M, and A takes its value from a side effect X on M, then the value computed by B shall either be the value stored by X or the value stored by a side effect Y on M, where Y follows X in the modification order of M. [ Note: This requirement is known as read-read coherence. —end note ]

In your code there are two side effects, and by p8 they occur in some particular total order. In Thread 3, the value computation to calculate the value to be stored in r1 happens before that of r2, so given r1 == 1 and r2 == 2 we know that the store performed by Thread 1 precedes the store performed by Thread 2 in the modification order of x. That being the case, Thread 4 cannot observe r3 == 2, r4 == 1 without running afoul of p18. This is regardless of the memory_order used.
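
As a rough illustration of the argument above, the example from the question can be written as a small stress test using relaxed ordering throughout. This is only a sketch: the names Observed, run_once, is_forbidden, count_forbidden, and expect_coherent are made up for this example. A stress test can at best fail to find the outcome; it is the coherence rules quoted above, not the test, that rule it out.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Values read in one run of the four-thread example (hypothetical helper type).
struct Observed { int r1, r2, r3, r4; };

// Runs the four threads from the question once, with relaxed ordering everywhere.
Observed run_once() {
    std::atomic<int> x{0};
    Observed o{};
    std::thread t1([&] { x.store(1, std::memory_order_relaxed); });
    std::thread t2([&] { x.store(2, std::memory_order_relaxed); });
    std::thread t3([&] {
        o.r1 = x.load(std::memory_order_relaxed);
        o.r2 = x.load(std::memory_order_relaxed);
    });
    std::thread t4([&] {
        o.r3 = x.load(std::memory_order_relaxed);
        o.r4 = x.load(std::memory_order_relaxed);
    });
    // join() synchronizes with thread completion, so reading o afterwards is safe.
    t1.join(); t2.join(); t3.join(); t4.join();
    return o;
}

// True only for the outcome the question asks about.
bool is_forbidden(const Observed& o) {
    return o.r1 == 1 && o.r2 == 2 && o.r3 == 2 && o.r4 == 1;
}

// Repeats the experiment and counts how often the forbidden outcome shows up.
int count_forbidden(int trials) {
    int hits = 0;
    for (int i = 0; i < trials; ++i)
        if (is_forbidden(run_once())) ++hits;
    return hits;
}

// On a conforming implementation this assertion should never fire.
void expect_coherent(int trials) {
    assert(count_forbidden(trials) == 0);
}
```

Calling expect_coherent(1000) from main should never trip the assertion on a conforming implementation. Depending on the toolchain, thread support may need to be enabled at build time (for example -pthread with GCC or Clang).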

There is a note in p21 (p19 in N3337) that is relevant:

[ Note: The four preceding coherence requirements effectively disallow compiler reordering of atomic operations to a single object, even if both operations are relaxed loads. This effectively makes the cache coherence guarantee provided by most hardware available to C++ atomic operations. —end note ]

Footpace answered 6/12, 2014 at 16:19 Comment(4)
Can you help me understand p18? Is value computation synonymous with load-from-atomic and side effect synonymous with store-to-atomic? - Educatory
@TobiasBrüll The load is a value computation; the store is a side effect. - Footpace
What other kinds of value computations exist? And what other kinds of side effects? - Educatory
@TobiasBrüll On (non-volatile-qualified) atomic objects? I can't really think of any, at least as relevant to p18. In general, value computations include both determining the identity of the object and fetching the stored value; side effects include accessing an object via a volatile glvalue, modifying an object, calling a library I/O function, or calling a function that does any of the above ([intro.execution]/p12). - Footpace

Per C++11 [intro.multithread]/6: "All modifications to a particular atomic object M occur in some particular total order, called the modification order of M." Together with the coherence requirements in the same section, this means that reads of an atomic object by a particular thread will never see "older" values than those the thread has already observed. Note that there is no mention of memory orderings here, so this property holds for all of them, from seq_cst through relaxed.

In the example given in the OP, the modification order of x can be either (0,1,2) or (0,2,1). A thread that has observed a given value in that modification order cannot later observe an earlier value. The outcome r1==1, r2==2 implies that the modification order of x is (0,1,2), but r3==2, r4==1 implies it is (0,2,1), a contradiction. So that outcome is not possible on an implementation that conforms to C++11.
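
The case analysis above can be mechanized in a few lines. The following is a minimal sketch, not a model of the full memory model: it only checks the "never step backwards in the modification order" rule, it assumes every store writes a distinct value (so a value read identifies its store), and the function names position, reads_consistent, and outcome_allowed are invented for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Index of value v in a candidate modification order, or -1 if it is absent.
int position(const std::vector<int>& order, int v) {
    auto it = std::find(order.begin(), order.end(), v);
    return it == order.end() ? -1 : static_cast<int>(it - order.begin());
}

// True if one thread's successive reads never move backwards in `order`.
bool reads_consistent(const std::vector<int>& order, const std::vector<int>& reads) {
    int last = -1;
    for (int v : reads) {
        int p = position(order, v);
        if (p < 0 || p < last) return false;
        last = p;
    }
    return true;
}

// True if some modification order of x, starting from the initial value 0 and
// containing the stores of 1 and 2 in either order, explains every thread's reads.
bool outcome_allowed(const std::vector<std::vector<int>>& per_thread_reads) {
    std::vector<int> stores{1, 2};  // sorted, so next_permutation visits every order
    do {
        std::vector<int> order{0};
        order.insert(order.end(), stores.begin(), stores.end());
        bool ok = std::all_of(per_thread_reads.begin(), per_thread_reads.end(),
            [&](const std::vector<int>& r) { return reads_consistent(order, r); });
        if (ok) return true;
    } while (std::next_permutation(stores.begin(), stores.end()));
    return false;
}
```

With this sketch, outcome_allowed({{1, 2}, {2, 1}}) is false because neither (0,1,2) nor (0,2,1) fits both threads, while outcome_allowed({{1, 2}, {1, 2}}) is true under (0,1,2).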

Daryn answered 6/12, 2014 at 16:20 Comment(0)

Given that the C++11 rules definitely disallow this, here's a more qualitative / intuitive way to understand it:

If there are no further stores to x, eventually all readers will agree on its value (i.e., one of the two stores came second).

If it were possible for different threads to disagree about the order, then either they would disagree about the value permanently (or at least long-term), or one thread would see the value change an extra, third time (a phantom store).

Fortunately C++11 doesn't allow either of those possibilities.

Utter answered 3/6, 2018 at 17:46 Comment(0)
