Consider this example:

    #include <atomic>
    #include <iostream>
    #include <random>
    #include <thread>

    // Returns 0, 1, or 2 uniformly at random; 0 means "close".
    int need_close() {
        std::random_device rd;
        std::mt19937 gen(rd());
        std::uniform_int_distribution<int> distribute(0, 2);
        return distribute(gen);
    }

    void invoke(std::atomic<bool>& is_close) {
        if (is_close.load(std::memory_order::relaxed)) { // #1
            return;
        }
        auto r = need_close();
        if (r == 0) {
            is_close.store(true, std::memory_order::relaxed);
        }
    }

    int main() {
        std::atomic<bool> is_close{false};
        // Both threads race on the same flag.
        std::thread t1([&]() {
            for (auto i = 0; i < 100000; i++)
                invoke(is_close);
        });
        std::thread t2([&]() {
            for (auto i = 0; i < 100000; i++)
                invoke(is_close);
        });
        t1.join();
        t2.join();
    }

In this example, is `relaxed` ordering sufficient to avoid the call to `need_close` once the thread sees that `is_close == true` (not immediately, but at some point once the thread can read `is_close == true`)? It seems that I don't need synchronization to avoid a data race, because there are no conflicting actions in this example. However, from the compiler implementation's point of view, the compiler may reorder `#1` to any place following it because `relaxed` is used here. For example, if `#1` is moved to some place after the call to `need_close`, then `need_close` will always be called again even though `is_close` has been set to `true`, which is not expected. So, I wonder whether `Acquire/Release` ordering is necessary to prevent the compiler from reordering the code, so that the logic behaves as expected?
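To make the expectation concrete, here is a minimal single-threaded sketch, not the original program. `need_close_stub`, `stub_calls`, and `calls_after_n_invocations` are hypothetical names introduced only for illustration, and the stub asks to close on its third call instead of using random numbers. It counts how often the call that follows `#1` is actually reached.

```cpp
#include <atomic>

// Hypothetical instrumentation: how many times the call after #1 was reached.
std::atomic<int> stub_calls{0};

// Hypothetical stand-in for need_close(): requests a close on its third call.
bool need_close_stub() {
    return stub_calls.fetch_add(1, std::memory_order_relaxed) + 1 >= 3;
}

void invoke(std::atomic<bool>& is_close) {
    if (is_close.load(std::memory_order_relaxed)) { // #1
        return; // the early return is sequenced before the call below
    }
    if (need_close_stub()) {
        is_close.store(true, std::memory_order_relaxed);
    }
}

// Runs invoke() n times on a fresh flag and reports how often the stub ran.
int calls_after_n_invocations(int n) {
    stub_calls.store(0, std::memory_order_relaxed);
    std::atomic<bool> is_close{false};
    for (int i = 0; i < n; ++i) {
        invoke(is_close);
    }
    return stub_calls.load(std::memory_order_relaxed);
}
```

Calling `calls_after_n_invocations(1000)` yields 3: after the third call stores `true`, every later `invoke` returns at `#1`. This only covers the single-threaded case; it says nothing about when another thread first observes the store.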
[...] `memory_order` would prevent this code from calling `need_close()` more than once. Also I suggest replacing `rand()` with something else (something thread-safe), if your question is not specifically about it. – Moriyama

[...] `std::call_once` (with the necessary synchronisation baked in) – Hardship

[...] `#1` is useless? – Ghostwrite

[...] `need_close()` and pausing before setting the flag, then the second thread entering `invoke()` and also calling `need_close()`. – Moriyama

[...] `memory_order_seq_cst` (sequential consistency) – Rugging

[...] `#1` is not reordered, the `need_close` won't be called at some time once after the thread reads `close==true`. Instead, if the compiler reorders `#1`, the `need_close` still is called even though the thread reads `close==true`, it's the difference here. – Ghostwrite

[...] `close==true` and won't call the `need_close`. – Ghostwrite

[...] `need_close()` twice by adding a delay after it: gcc.godbolt.org/z/68Kn9Khv8 – Moriyama

[...] `relaxed` ordering sufficient to make the `need_close` not be called once the thread sees `is_close == true`? If the compiler reordering the code, `need_close` can always be called. – Ghostwrite

[...] `print` is not called once the thread sees `is_close == true`. – Ghostwrite

[...] `is_close.load()` is true, the early-out `return` runs, so nothing else in the function happens. This is sequenced before the call to `need_close`. Compilers can't just arbitrarily shuffle source lines (which isn't how they optimize anyway), they have to respect sequencing to not break single-threaded programs. The CPU might fetch and decode the machine code for `need_close()` in the shadow of a mispredicted branch just like in single-threaded code, but the end result has to be as if the code ran in program order with `is_close.load()` having produced whatever value. – Xerosis

[...] `is_close == true`" But this isn't a useful question, is it? By definition when it sees it being true, it won't pass the first `if`. The question is when it sees it. – Moriyama

[...] `if` is reordered to some place after the call point of `need_close()` by the compiler, the `need_close` will keep invoking regardless of when the thread sees the `is_close==true` – Ghostwrite

[...] `sequenced-before` principle, the first `if` cannot be reordered to some place after the call site of `need_close()` even though the `relaxed` ordering would permit the compiler to do, right? – Ghostwrite

[...] `is_close` be a plain `bool`. Would you now think that the compiler can move the check `if (is_close)` after `need_close()`? You certainly wouldn't. That in the posted case the type is different and the procedure runs in a thread does not change anything with regards to this kind of (forbidden) reordering. – Reflect

[...] `need_close()` while still waiting for a value for `is_close.load(relaxed)`, and as long as it eventually sees `false` it can confirm this path of execution as the correct one, retiring those instructions. But if not, that speculatively executed work is actually a mis-speculation. The compiler could only reorder things by similarly speculating, like unconditionally doing some computation into a temporary but only actually committing visible side-effects after checking the `is_close.load()` result. – Xerosis
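The `std::call_once` suggestion from the comments could look roughly like the sketch below, assuming the goal is that a closing action runs exactly once no matter how many threads race to it. This is an assumption about the intent, and it drops the random decision from the original code. `run_close_once`, `closes`, and the loop count of 1000 are hypothetical names and values chosen for illustration.

```cpp
#include <atomic>
#include <mutex>
#include <thread>
#include <vector>

// Starts thread_count threads that all try to perform the closing action,
// and returns how many times the action actually ran.
int run_close_once(int thread_count) {
    std::once_flag close_flag;  // guards the one-time action
    std::atomic<int> closes{0}; // hypothetical counter standing in for the action
    std::vector<std::thread> workers;

    for (int t = 0; t < thread_count; ++t) {
        workers.emplace_back([&]() {
            for (int i = 0; i < 1000; ++i) {
                // Only the first caller runs the lambda; concurrent callers
                // wait for it to finish, later callers skip it.
                std::call_once(close_flag, [&]() {
                    closes.fetch_add(1, std::memory_order_relaxed);
                });
            }
        });
    }
    for (auto& worker : workers) {
        worker.join();
    }
    return closes.load(std::memory_order_relaxed);
}
```

With this shape, `run_close_once(2)` returns 1 regardless of how the two threads interleave. By contrast, the load-then-store pattern in `invoke` lets both threads pass `#1` before either one stores `true`.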