Will relaxed memory order lead to an infinite loop here?

Code in question:

#include <atomic>
#include <thread>

std::atomic_bool stop(false);

void wait_on_stop() {
  while (!stop.load(std::memory_order_relaxed));
}

int main() {
  std::thread t(wait_on_stop);
  stop.store(true, std::memory_order_relaxed);
  t.join();
}

Since std::memory_order_relaxed is used here, I assume the compiler is free to reorder stop.store() after t.join(). As a result, t.join() would never return. Is this reasoning correct?

If yes, will changing stop.store(true, std::memory_order_relaxed) to stop.store(true) solve the issue?
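
For reference, a minimal sketch of the variant asked about above; the only change from the code in question is that the store omits the explicit order and therefore uses the default std::memory_order_seq_cst:

#include <atomic>
#include <thread>

std::atomic_bool stop(false);

void wait_on_stop() {
  while (!stop.load(std::memory_order_relaxed));  // relaxed load, unchanged
}

int main() {
  std::thread t(wait_on_stop);
  stop.store(true);  // no explicit order, so this defaults to std::memory_order_seq_cst
  t.join();
}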

Settera answered 22/5, 2018 at 7:5 Comment(5)
These thread operations are optimization barriers, so noCourageous
@PasserBy Could you provide some references? All I know so far is that the completion of t synchronizes with successful return from t.join(), which does not help much.Settera
Since stop is a global variable, I believe the compiler will emit the code for stop.store() before the call to t.join(). On the other hand, I think the processor will be allowed to defer the visibility of the store operation.Tattler
@YannDroneaud Till now, your comment is the only thing that makes sense to me.Settera
Jeff Preshing has written an article on his blog that is related to your question. The answer given by T.C. appears to be correct: atomic stores must become visible within a reasonable time.Peppery

[intro.progress]/18:

An implementation should ensure that the last value (in modification order) assigned by an atomic or synchronization operation will become visible to all other threads in a finite period of time.

[atomics.order]/12:

Implementations should make atomic stores visible to atomic loads within a reasonable amount of time.

These are non-binding recommendations. If your implementation follows them - as high-quality implementations should - you are fine. Otherwise, you are screwed. Either way, the outcome does not depend on the memory order used.


The C++ abstract machine has no concept of "reordering". In the abstract semantics, the main thread stores into the atomic and then blocks, so if the implementation makes the store visible to loads within a finite amount of time, the other thread will load the stored value within a finite amount of time and terminate. Conversely, if the implementation doesn't do so for whatever reason, your other thread will loop forever. The memory order used is irrelevant.

I've never found reasoning about "reordering" to be useful. It mixes up low-level implementation detail with a high-level memory model, and tends to make things more confusing, not less.
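
To illustrate that the memory order does not decide termination here, a sketch (the run helper is mine, not from the question) that executes the same wait/store pair under several memory orders; on an implementation that follows the recommendations quoted above, every variant terminates:

#include <atomic>
#include <thread>
#include <cstdio>

// Hypothetical helper (not part of the original question): runs the same
// wait/store pair with the given orders for the load and the store.
static void run(std::memory_order load_order, std::memory_order store_order) {
  std::atomic_bool stop(false);
  std::thread t([&] {
    while (!stop.load(load_order));   // spin until the store becomes visible
  });
  stop.store(true, store_order);
  t.join();                           // returns once the waiter has seen the store
}

int main() {
  run(std::memory_order_relaxed, std::memory_order_relaxed);  // the code in question
  run(std::memory_order_relaxed, std::memory_order_seq_cst);  // the proposed alternative
  run(std::memory_order_acquire, std::memory_order_release);
  std::puts("all variants terminated");
}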

Digitalize answered 22/5, 2018 at 8:50 Comment(4)
But the store may not happen at all, if it is reordered after join.Settera
There's no such thing as "reordering" in the abstract machine.Digitalize
Related: Why set the stop flag using `memory_order_seq_cst`, if you check it with `memory_order_relaxed`? has some discussion about the fact that inter-thread latency is a quality-of-implementation issue, C++ just compiles to asm loads and stores; it's hardware cache coherency that gives us low latency.Thao
@Lingxi: That compile-time reordering would violate the as-if rule, creating a deadlock or infinite loop where one didn't exist in the source. In practice on real implementations, the compiler can't see the code for some of the library functions that .join() calls, so it can't be sure they don't read your global atomic_bool stop. See How C++ Standard prevents deadlock in spinlock mutex with memory_order_acquire and memory_order_release? for more discussion about what in the standard forbids introducing deadlocks or other infinite loops with static reordering.Thao

The compiler must treat a call to any function whose definition is not available in the current translation unit as if it could perform I/O or access any reachable object. Such a call is therefore assumed to have side effects, and the compiler cannot move statements that follow the call before it, nor statements that precede the call after it.

[intro.execution]:

Reading an object designated by a volatile glvalue ([basic.lval]), modifying an object, calling a library I/O function, or calling a function that does any of those operations are all side effects, which are changes in the state of the execution environment. Evaluation of an expression (or a subexpression) in general includes both value computations (including determining the identity of an object for glvalue evaluation and fetching a value previously assigned to an object for prvalue evaluation) and initiation of side effects. When a call to a library I/O function returns or an access through a volatile glvalue is evaluated the side effect is considered complete, even though some external actions implied by the call (such as the I/O itself) or by the volatile access may not have completed yet.

And

Every value computation and side effect associated with a full-expression is sequenced before every value computation and side effect associated with the next full-expression to be evaluated.

Here the std::thread constructor and std::thread::join are such functions (they eventually call platform-specific thread functions whose definitions are unavailable in the current TU), and their calls have side effects. stop.store also causes a side effect (a memory store is a side effect). Hence stop.store cannot be moved before the std::thread constructor or past the std::thread::join call.
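
A minimal sketch of that argument (opaque_call is a hypothetical stand-in for the platform thread function that std::thread::join eventually calls; its definition is assumed to live in another TU): because the compiler cannot see the callee's body, it must assume the callee might itself load stop, so the store cannot be sunk below the call.

#include <atomic>

std::atomic_bool stop(false);

// Hypothetical stand-in for a function defined in another translation unit,
// as the platform thread functions behind std::thread::join are.
void opaque_call();

void store_then_join_like() {
  stop.store(true, std::memory_order_relaxed);
  opaque_call();  // might read `stop`, so the store above cannot be moved past this call
}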

Wont answered 22/5, 2018 at 11:21 Comment(2)
Then how do you explain this? I guess std::chrono::high_resolution_clock::now() is the kind of function you are talking about.Settera
@Settera There is already a comprehensive answer to that question.Wont
