[intro.progress]/18:
An implementation should ensure that the last value (in modification
order) assigned by an atomic or synchronization operation will become
visible to all other threads in a finite period of time.
[atomics.order]/12:
Implementations should make atomic stores visible to atomic loads
within a reasonable amount of time.
These are non-binding recommendations. If your implementation follows them, as high-quality implementations should, you are fine. Otherwise, you are screwed. Either way, the memory order used makes no difference.
The C++ abstract machine has no concept of "reordering". In the abstract semantics, the main thread stored into the atomic and then blocked, and so if the implementation makes the store visible to loads within a finite amount of time, then the other thread will load this stored value within a finite amount of time and terminate. Conversely, if the implementation doesn't do so for whatever reason, then your other thread will loop forever. The memory order used is irrelevant.
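To make the scenario concrete, here is a minimal sketch of the kind of program being discussed. The names stop and t are taken from the comments; the function name run_until_stopped and the observed flag are hypothetical, added only for illustration. The store and load use memory_order_relaxed on purpose: whether the loop terminates depends only on the implementation making the store visible in finite time, not on the ordering argument.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Global flag, as in the discussion (assumed to be a std::atomic<bool>).
std::atomic<bool> stop{false};

// Hypothetical helper: spawns a worker that spins until it sees stop == true.
// Returns true once the worker has observed the store and exited.
bool run_until_stopped() {
    stop.store(false, std::memory_order_relaxed);  // reset so the helper can be reused

    bool observed = false;  // written by the worker, read only after join()

    std::thread t([&observed] {
        // Spin until the store becomes visible. The standard only recommends
        // (does not require) that this happens in a finite period of time.
        while (!stop.load(std::memory_order_relaxed)) {
        }
        observed = true;
    });

    // In the abstract machine: store first, then block in join().
    stop.store(true, std::memory_order_relaxed);
    t.join();  // thread completion synchronizes with the return from join()

    assert(observed);  // safe to read: the worker's write happens before join() returns
    return observed;
}
```

On an implementation that follows the recommendations quoted above, run_until_stopped() returns true. Changing both memory orders to seq_cst does not turn the recommendation into a guarantee.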
I've never found reasoning about "reordering" to be useful. It mixes up low-level implementation detail with a high-level memory model, and tends to make things more confusing, not less.
t synchronizes with successful return from t.join(), which does not help much. – Settera
stop is a global variable, so I believe the compiler will emit the code for stop.store() before the call to t.join(). On the other hand, I think the processor will be allowed to defer the visibility of the store operation. – Tattler