Will relaxed memory order lead to an infinite loop here?

Code in question:

#include <atomic>
#include <thread>

std::atomic_bool stop(false);

void wait_on_stop() {
  while (!stop.load(std::memory_order_relaxed));
}

int main() {
  std::thread t(wait_on_stop);
  stop.store(true, std::memory_order_relaxed);
  t.join();
}

Since std::memory_order_relaxed is used here, I assume the compiler is free to reorder stop.store() after t.join(). As a result, t.join() would never return. Is this reasoning correct?

If yes, will changing stop.store(true, std::memory_order_relaxed) to stop.store(true) solve the issue?
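
For reference, a minimal sketch of the variant asked about above; the only change from the code in question is that the store omits the explicit order and therefore uses the default std::memory_order_seq_cst:

#include <atomic>
#include <thread>

std::atomic_bool stop(false);

void wait_on_stop() {
  while (!stop.load(std::memory_order_relaxed));  // relaxed load, unchanged
}

int main() {
  std::thread t(wait_on_stop);
  stop.store(true);  // no explicit order, so this defaults to std::memory_order_seq_cst
  t.join();
}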

Settera answered 22/5, 2018 at 7:5 Comment(5)
These thread operations are optimization barriers, so noCourageous
@PasserBy Could you provide some references? All I know so far is that the completion of t synchronizes with successful return from t.join(), which does not help much.Settera
Since stop is a global variable, I believe the compiler will emit the code for stop.store() before the call to t.join(). On the other hand, I think the processor will be allowed to defer the visibility of the store operation.Tattler
@YannDroneaud Till now, your comment is the only thing that makes sense to me.Settera
Jeff Preshing has written an article on his blog that is related to your question. The answer given by T.C. appears to be correct: atomic stores must become visible within a reasonable time.Peppery

[intro.progress]/18:

An implementation should ensure that the last value (in modification order) assigned by an atomic or synchronization operation will become visible to all other threads in a finite period of time.

[atomics.order]/12:

Implementations should make atomic stores visible to atomic loads within a reasonable amount of time.

These are non-binding recommendations. If your implementation follows them - as high-quality implementations should - you are fine. Otherwise, you are screwed. Either way, the outcome does not depend on the memory order used.


The C++ abstract machine has no concept of "reordering". In the abstract semantics, the main thread stores into the atomic and then blocks, so if the implementation makes the store visible to loads within a finite amount of time, the other thread will load the stored value within a finite amount of time and terminate. Conversely, if the implementation doesn't do so for whatever reason, your other thread will loop forever. The memory order used is irrelevant.

I've never found reasoning about "reordering" to be useful. It mixes up low-level implementation detail with a high-level memory model, and tends to make things more confusing, not less.
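
To illustrate that the memory order does not decide termination here, a sketch (the run helper is mine, not from the question) that executes the same wait/store pair under several memory orders; on an implementation that follows the recommendations quoted above, every variant terminates:

#include <atomic>
#include <thread>
#include <cstdio>

// Hypothetical helper (not part of the original question): runs the same
// wait/store pair with the given orders for the load and the store.
static void run(std::memory_order load_order, std::memory_order store_order) {
  std::atomic_bool stop(false);
  std::thread t([&] {
    while (!stop.load(load_order));   // spin until the store becomes visible
  });
  stop.store(true, store_order);
  t.join();                           // returns once the waiter has seen the store
}

int main() {
  run(std::memory_order_relaxed, std::memory_order_relaxed);  // the code in question
  run(std::memory_order_relaxed, std::memory_order_seq_cst);  // the proposed alternative
  run(std::memory_order_acquire, std::memory_order_release);
  std::puts("all variants terminated");
}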

Digitalize answered 22/5, 2018 at 8:50 Comment(4)
But the store may not happen at all, if it is reordered after join.Settera
There's no such thing as "reordering" in the abstract machine.Digitalize
Related: Why set the stop flag using `memory_order_seq_cst`, if you check it with `memory_order_relaxed`? has some discussion about the fact that inter-thread latency is a quality-of-implementation issue, C++ just compiles to asm loads and stores; it's hardware cache coherency that gives us low latency.Thao
@Lingxi: That compile-time reordering would violate the as-if rule, creating a deadlock or infinite loop where one didn't exist in the source. In practice on real implementations, the compiler can't see the code for some of the library functions that .join() calls, so it can't be sure they don't read your global atomic_bool stop. See How C++ Standard prevents deadlock in spinlock mutex with memory_order_acquire and memory_order_release? for more discussion about what in the standard forbids introducing deadlocks or other infinite loops with static reordering.Thao

The compiler must treat a call to any function whose definition is not available in the current translation unit as if it could perform I/O or access any reachable object. Such a call is therefore assumed to have side effects, and the compiler cannot move statements that follow the call before it, nor statements that precede the call after it.

[intro.execution]:

Reading an object designated by a volatile glvalue ([basic.lval]), modifying an object, calling a library I/O function, or calling a function that does any of those operations are all side effects, which are changes in the state of the execution environment. Evaluation of an expression (or a subexpression) in general includes both value computations (including determining the identity of an object for glvalue evaluation and fetching a value previously assigned to an object for prvalue evaluation) and initiation of side effects. When a call to a library I/O function returns or an access through a volatile glvalue is evaluated the side effect is considered complete, even though some external actions implied by the call (such as the I/O itself) or by the volatile access may not have completed yet.

And

Every value computation and side effect associated with a full-expression is sequenced before every value computation and side effect associated with the next full-expression to be evaluated.

Here the std::thread constructor and std::thread::join are such functions (they eventually call platform-specific thread functions whose definitions are unavailable in the current TU), and their calls have side effects. stop.store also causes a side effect (a memory store is a side effect). Hence stop.store cannot be moved before the std::thread constructor or past the std::thread::join call.
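
A minimal sketch of that argument (opaque_call is a hypothetical stand-in for the platform thread function that std::thread::join eventually calls; its definition is assumed to live in another TU): because the compiler cannot see the callee's body, it must assume the callee might itself load stop, so the store cannot be sunk below the call.

#include <atomic>

std::atomic_bool stop(false);

// Hypothetical stand-in for a function defined in another translation unit,
// as the platform thread functions behind std::thread::join are.
void opaque_call();

void store_then_join_like() {
  stop.store(true, std::memory_order_relaxed);
  opaque_call();  // might read `stop`, so the store above cannot be moved past this call
}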

Wont answered 22/5, 2018 at 11:21 Comment(2)
Then how do you explain this? I guess std::chrono::high_resolution_clock::now() is the kind of function you are talking about.Settera
@Settera There is already a comprehensive answer to that question.Wont
