Consider this code:
std::atomic<int> x{ 0 };
std::atomic<int> y{ 0 };
int a;
int b;
void thread1()
{
//atomic op A
x.store(1, std::memory_order_relaxed);
//fence X
std::atomic_thread_fence(std::memory_order_seq_cst);
//sequenced-before P, thus in SC order X=>P
//atomic op P
a = y.load(std::memory_order_seq_cst);//0
//reads-before(from-read) Q, thus in SC order P=>Q
}
void thread2()
{
//atomic op Q
y.store(1, std::memory_order_seq_cst);
//sequenced-before B, thus in SC order Q=>B
//atomic op B
b = x.load(std::memory_order_seq_cst);
}
int main()
{
std::thread t2(thread2);
std::thread t1(thread1);
t1.join();
t2.join();
assert(a == 1 || b == 1);//true?
return 0;
}
The question is: is the assertion a == 1 || b == 1
always true in C++20?
I think it's true in C++17. The idea is this: first assume a get 0, then prove b must get 1. Let's split it into two parts:
1. Fence X precedes atomic op B in the SC total order
- X is sequenced-before P, thus X=>P
- P reads 0 and Q writes 1, thus P reads-before(from-read) Q, thus P=>Q
- I don't find the clause in the standard that guarantees this, but this paper said so. Please point out if I get it wrong. Edit: Now I know. It's the same rule about coherence-ordered before, as in Nate's answer.
- Q is sequenced-before B, thus Q=>B
2. Atomic op B reads the value(1) written by atomic op A
C++17 has this in the standard:
For atomic operations A and B on an atomic object M, where A modifies M and B takes its value, if there is a memory_order::seq_cst fence X such that A is sequenced before X and B follows X in S, then B observes either the effects of A or a later modification of M in its modification order.
It's exactly the case here. Op A and B are on atomic object x, where A stores 1 in x and B reads x's value; and there is the seq_cst fence X; A is sequenced before X and B follows X in S(the above); and there's no modification of M later than A. So B observes the value written by A. So b gets 1.
Is the above reasoning correct?
But in C++20, the above sited text about fences is removed (by P0668), and upgraded to "strengthened" seq_cst fences, which reads:
An atomic operation A on some atomic object M is coherence-ordered before another atomic operation B on M if
- A is a modification, and B reads the value stored by A, or
- A precedes B in the modification order of M, or
- A and B are not the same atomic read-modify-write operation, and there exists an atomic modification X of M such that A reads the value stored by X and X precedes B in the modification order of M, or
- there exists an atomic modification X of M such that A is coherence-ordered before X and X is coherence-ordered before B.
There is a single total order S on all memory_order::seq_cst operations, including fences, that satisfies the following constraints. First, if A and B are memory_order::seq_cst operations and A strongly happens before B, then A precedes B in S. Second, for every pair of atomic operations A and B on an object M, where A is coherence-ordered before B, the following four conditions are required to be satisfied by S:
- if A and B are both memory_order::seq_cst operations, then A precedes B in S; and
- if A is a memory_order::seq_cst operation and B happens before a memory_order::seq_cst fence Y , then A precedes Y in S; and
- if a memory_order::seq_cst fence X happens before A and B is a memory_order::seq_cst operation, then X precedes B in S; and
- if a memory_order::seq_cst fence X happens before A and B happens before a memory_order::seq_- cst fence Y , then X precedes Y in S.
I don't see how a==1 || b==1
is guaranteed any more. It's only about how SC fences are ordered with regard to other SC operations, or operations that are coherence-ordered with each other.
I can't find anything else about how SC fences interact with non-SC atomic operations in the C++20 standard, and a load that does not see a value from a store doesn't mean it's coherence-ordered before the store. (Or does it? If so that would fix the problem; see comments) 31.4 : 3.2 applies when A precedes B in the modification order of M, but reads aren't part of the modification order of an object, are they?
Is it I didn't look into the standard hard enough, or the code simply doesn't work any more in C++20? If it's the later, is the regression intended? (In P0668, they claim to "strengthen" fences).
a == 1 || b == 1
is guaranteed. Relaxed store + seq_cst barrier is at least as strong as a seq_cst store, and any interleaving of source order has at least one load after the store in the other thread. I don't know about C++20 yet, haven't read the details. I expect it's still something the committee wants to be true, so if there is a change that might be a defect. (But I don't expect real-world implementations would change how they compile it, so in practice everything's fine on normal CPUs, of course.) – Jubbaha==0 && b==0
then say B(the load of x) is coherence-ordered before A(the store of x). Then applying the second bullet list : "- if A is a memory_order::seq_cst operation and B happens before a memory_order::seq_cst fence Y , then A precedes Y in S;", you get B=>X in S. That's a circle in S (X=>P=>Q=>B=>X), which is prohibited. Soa==0 && b==0
cannot be true. – Sturdy