Is the explanation of relaxed ordering erroneous in cppreference?

Asked 11/1, 2020 at 16:35 Answered 11/1, 2020 at 19:42

c++multithreading c++11 language-lawyer as-if

In the documentation of std::memory_order on cppreference.com there is an example of relaxed ordering:

Relaxed ordering

Atomic operations tagged memory_order_relaxed are not synchronization operations; they do not impose an order among concurrent memory accesses. They only guarantee atomicity and modification order consistency.

For example, with x and y initially zero,
// Thread 1:
r1 = y.load(std::memory_order_relaxed); // A
x.store(r1, std::memory_order_relaxed); // B
// Thread 2:
r2 = x.load(std::memory_order_relaxed); // C
y.store(42, std::memory_order_relaxed); // D
is allowed to produce r1 == r2 == 42 because, although A is sequenced-before B within thread 1 and C is sequenced before D within thread 2, nothing prevents D from appearing before A in the modification order of y, and B from appearing before C in the modification order of x. The side-effect of D on y could be visible to the load A in thread 1 while the side effect of B on x could be visible to the load C in thread 2. In particular, this may occur if D is completed before C in thread 2, either due to compiler reordering or at runtime.

it says "C is sequenced before D within thread 2".

According to the definition of sequenced-before, which can be found in Order of evaluation, if A is sequenced before B, then evaluation of A will be completed before evaluation of B begins. Since C is sequenced before D within thread 2, C must be completed before D begins, hence the condition part of the last sentence of the snapshot will never be satisfied.

Moser answered 11/1, 2020 at 16:35 Comment(3)

Is your question specifically about C++11? – Sapir 13/1, 2020 at 2:28

no, it also applies to c++14,17. I know both the compiler and CPU may reorder C with D.But if the reordering occurs,C can't be completed before D begins. So I think there is a terminology-misuse in the sentence " A is sequenced-before B within thread 1 and C is sequenced before D within thread 2". It's more accurate to say "In the code,A is PLACED BEFORE B within thread 1 and C is PLACED BEFORE D within thread 2". The aim of this question is to confirm this thought – Moser 13/1, 2020 at 3:18

Nothing is defined in term of "reordering". – Sapir 13/1, 2020 at 8:36

I believe cppreference is right. I think this boils down to the "as-if" rule [intro.execution]/1. Compilers are only bound to reproduce the observable behavior of the program described by your code. A sequenced-before relation is only established between evaluations from the perspective of the thread in which these evaluations are performed [intro.execution]/15. That means when two evaluations sequenced one after the other appear somewhere in some thread, the code actually running in that thread must behave as if whatever the first evaluation does did indeed affect whatever the second evaluation does. For example

int x = 0;
x = 42;
std::cout << x;

must print 42. However, the compiler doesn't actually have to store the value 42 into an object x before reading the value back from that object to print it. It may as well remember that the last value to be stored in x was 42 and then simply print the value 42 directly before doing an actual store of the value 42 to x. In fact, if x is a local variable, it may as well just track what value that variable was last assigned at any point and never even create an object or actually store the value 42. There is no way for the thread to tell the difference. The behavior is always going to be as if there was a variable and as if the value 42 were actually stored in an object x before being loaded from that object. But that doesn't mean that the generated machine code has to actually store and load anything anywhere ever. All that is required is that the observable behavior of the generated machine code is indistinguishable from what the behavior would be if all these things were to actually happen.

If we look at

r2 = x.load(std::memory_order_relaxed); // C
y.store(42, std::memory_order_relaxed); // D

then yes, C is sequenced before D. But when viewed from this thread in isolation, nothing that C does affects the outcome of D. And nothing that D does would change the result of C. The only way one could affect the other would be as an indirect consequence of something happening in another thread. However, by specifying std::memory_order_relaxed, you explicitly stated that the order in which the load and store are observed by another thread is irrelevant. Since no other thread can observe the load and store in any particular order, there is nothing another thread could do to make C and D affect each other in a consistent manner. Thus, the order in which the load and store are actually performed is irrelevant. Thus, the compiler is free to reorder them. And, as mentioned in the explanation underneath that example, if the store from D is performed before the load from C, then r1 == r2 == 42 can indeed come about…

Ligroin answered 11/1, 2020 at 17:56 Comment(17)

So essentially the standard states that C must happen before D, but compiler believes that it can't be proven whether C or D happened next and, due to the as-if rule, reorders them anyway, correct? – Oistrakh 11/1, 2020 at 18:17

@Oistrakh No. C must happen before D as far as the thread on which they happen can tell. Observing from another context may be inconsistent with that view. – Llamas 11/1, 2020 at 19:5

There is no "as if rule" – Sapir 12/1, 2020 at 2:12

@Sapir This claim seems similar to your other, previous, C++ evangelisms. – Jerid 12/1, 2020 at 4:5

Yes,compiler can reorder AB in thread 1 and CD in thread 2. Even if compiler didn't perform the reordering, some CPU may reorder AB and CD at runtime. However, if such reordering did occur, C wont't be completed before D begins, A wont't be completed before B begins. So there is a teminology-misuse in the sentence " A is sequenced-before B within thread 1 and C is sequenced before D within thread 2". It is better to say "In the source code, A is PLACED BEFORE B within thread 1 and C is PLACED BEFORE D within thread 2",right? – Moser 12/1, 2020 at 5:11

@LightnessRacesBY-SA3.0 Can you show what the (non existant) as if rule allows that wouldn't be allowed otherwise? – Sapir 12/1, 2020 at 9:52

@Sapir Michael has posted a long explanation of it along with links to the relevant chapters in the standard. – Jerid 12/1, 2020 at 14:29

@Moser Sequenced-before is a rule for a single thread. How, if, and when updates from one thread conflict with / are seen by other threads is a different chapter. But which order it is written down in the source-code is generally not interesting, unless and only insofar as it has an effect on that. – Llamas 13/1, 2020 at 2:35

@Moser The "sequenced before" relation is used to describe the behavior of the abstract machine defined by the standard. So in the abstract machine, C won't be completed before D begins. However, in some implementation, C may be completed before D begins, as long as the implementation and the abstract machine have the same observable behaviors. – Ruthi 13/1, 2020 at 9:17

@Sapir The standard does label one of it's provisions "the as-if rule" in a footnote: "This provision is sometimes called the “as-if” rule" intro.execution – Crossarm 13/1, 2020 at 14:36

@LightnessRacesBY-SA3.0 "Michael has posted a long explanation of it" A long explanation that starts with (the so called, non existant) "as if rule" which only says that the compiler must produce code that works and concludes that the compiler can produce code that fails. I have never read something as absurd as that claim. – Sapir 15/1, 2020 at 21:46

@Crossarm I don't know what "provision" means exactly, but the so called "as if rule" isn't a rule and doesn't do anything. It has no implication. It's a general observation, that was meant to make things clearer, but doesn't. You can't justify any specific compiler behavior based on the as if rule. Saying "according to the as if rule, blah blah blah" has as much weight as "according to correctness and good behavior, blah blah blah". It's a meaningless intro and any answer that cites it is probably wrong, and its author probably has no understanding of PL semantics. – Sapir 16/1, 2020 at 23:59

@Moser The source code is only useful to define traces of execution. Sequence before is a property of an execution. The intent is that what happens to be seen sequentially in one thread doesn't have to be seen that way in another, but as usual for the basic stuff, the std committee neglected to express its intent in words. – Sapir 17/1, 2020 at 0:2

"Since no other thread can observe the load and store in any particular order" I can't parse that sentence. What "order" are you talking about? – Sapir 17/1, 2020 at 0:6

@Sapir I quoted the footnote that names "The semantic descriptions in this International Standard define a parameterized nondeterministic abstract machine. This International Standard places no requirement on the structure of conforming implementations. In particular, they need not copy or emulate the structure of the abstract machine. Rather, conforming implementations are required to emulate (only) the observable behavior of the abstract machine as explained below." the as-if rule. – Crossarm 17/1, 2020 at 9:22

@Sapir and then "Certain other aspects and operations of the abstract machine are described in this International Standard as unspecified (for example, order of evaluation of arguments to a function). Where possible, this International Standard defines a set of allowable behaviors. These define the nondeterministic aspects of the abstract machine. An instance of the abstract machine can thus have more than one possible execution for a given program and a given input." There are cases where two implementations can get different results from the same program and both conform to the spec. – Crossarm 17/1, 2020 at 9:24

@Crossarm "This International Standard places no requirement on the structure of conforming implementations." is a description of the rest of the std. Either way, the way PL are usually specified (not in term of translation to asm or steps on a "virtual machine"), there is no way around it: only the end result is specified. "There are cases where two implementations can get different results from the same program" That's trivially true when you measure an impl defined quantity. – Sapir 17/1, 2020 at 17:34

It is sometimes possible for an action to be ordered relative to two other sequences of actions, without implying any relative ordering of the actions in those sequences relative to each other.

Suppose, for example, that one has the following three events:

store 1 to p1
load p2 into temp
store 2 to p3

and the read of p2 is independently ordered after the write of p1 and before the write of p3, but there is no particular ordering in which both p1 and p3 partake. Depending upon what is done with p2, it may be impractical for a compiler to defer p1 past p3 and still achieve the required semantics with p2. Suppose, however, the compiler knew that the above code was part of a larger sequence:

store 1 to p2 [sequenced before the load of p2]
[do the above]
store 3 into p1 [sequenced after the other store to p1]

In that case, it could determine that it could reorder the store to p1 after the above code and consolidate it with the following store, thus resulting in code that writes p3 without writing p1 first:

set temp to 1
store temp to p2
store 2 to p3
store 3 to p1

Although it may seem that data dependencies would cause certain parts of sequencing relations to behave transitively, a compiler may identify situations where apparent data dependencies don't exist, and would thus not have the transitive effects one would expect.

Switchman answered 11/1, 2020 at 18:16 Comment(0)

If there are two statements, the compiler will generate code in sequential order so code for the first one will be placed prior to the second one. But cpus internally have pipelines and are able to run assembly operations in parallel. Statement C is a load instruction. While memory is being fetched the pipeline will process the next few instructions and given they are not dependent on the load instruction they could end up being executed prior to C being finished (e.g. data for D was in cache, C in main memory).

If the user really needed the two statements to be executed sequentially, stricter memory ordering operations can be used. In general users don't care as long as the program is logically correct.

Culp answered 11/1, 2020 at 19:42 Comment(0)

Hot tags

Godot Unity Godot Help Programming Godot 4.X GUI GDScript 3D 2D Physics CSharp Godot 3.X VR XR Projects C++

Relaxed ordering

Recommended topics

Hot tags