Is a read on an atomic variable guaranteed to acquire the current value of it in C++11?
It is known that the modifications of a single atomic variable form a total order. Suppose we have an atomic read of some atomic variable v at wall-clock time T. Is this read then guaranteed to acquire the current value of v, i.e. the one written by the last write in the modification order of v at time T? To put it another way: if an atomic write completes before an atomic read in wall-clock time, and there are no other writes in between, is the read guaranteed to return the value just written?

My accepted answer is the 6th comment made by Oasis on his answer.

Garold answered 13/8, 2014 at 15:35 Comment(6)
Isn't that the whole point of being atomic? – Rivalee
After studying the complexities that may be involved in memory ordering, it no longer seems apparent to me. – Garold
@Rivalee Isn't the whole point of parallel MT to not have a global synchronous time? – Whaleback
@Whaleback I wouldn't say the 'whole' point, but AFAICT the OP used the concept of wall time just to express the concept of before/after. – Rivalee
@Rivalee Before/after is fully defined only within a given thread (and signals in that thread). Among threads there are various very abstract orders that must be consistent with the per-thread orders and sometimes with each other... – Whaleback
Related: Is a memory barrier required to read a value that is atomically modified? / Does a hardware memory barrier make visibility of atomic operations faster in addition to providing the necessary guarantees? (no, and RMWs aren't faster either). This part of the standard is how the formalism guarantees atomicity: another write can't happen on this variable between the load and store. Nothing more, nothing less. – Emmalynne

Wall-clock time is irrelevant. However, what you're describing sounds like the write-read coherence guarantee:

§1.10 [intro.multithread]/20

If a side effect X on an atomic object M happens before a value computation B of M, then the evaluation B shall take its value from X or from a side effect Y that follows X in the modification order of M.

(translating the standardese, "value computation" is a read, and "side effect" is a write)

In particular, if your relaxed write and your relaxed read are in different statements of the same function, they are connected by a sequenced-before relationship, therefore they are connected by a happens-before relationship, therefore the guarantee holds.

Oasis answered 13/8, 2014 at 16:04 Comment(11)
My question lies in the fact that a write inter-thread happens-before a read when the value computation reads the value of the write. But are there any conditions under which this actually happens (otherwise, it may never happen)? I am considering whether happens-before in terms of wall-clock time is such a condition. – Garold
The necessary condition is the definition of happens-before. It does not involve the wall clock. – Oasis
I think the following example may make my question clear. Consider two threads synchronized by an atomic spin-lock. Thread A first acquires the lock. Thread B is then busy-waiting. Next, thread A releases the lock. Is there any clue about when thread B sees this release by thread A and thus enters the critical section, or may it just wait infinitely long to see this release? – Garold
@Garold, the clue is the memory order semantics. The store in A has to be memory_order_release at least, and the load has to be memory_order_acquire. – Cerebration
@Cerebration I agree with you on the memory order semantics that must be used, and the memory_order_release write synchronizes-with the memory_order_acquire read when the read sees the value written by that write. But again, when does this happen? – Garold
@Garold 29.3 [atomics.order]/12: "Implementations should make atomic stores visible to atomic loads within a reasonable amount of time." – Oasis
@Oasis One way to force a read of the newest current value seems to be using an atomic read-modify-write operation. See this stackoverflow QA link. – Garold
@Garold Sure, and that's why RMWs take so long (the CPU has to sit and wait for all the acks). I am not sure I see the relationship with the original question, though: an RMW is a different beast from an atomic read or an atomic write. – Oasis
@Oasis "CPU has to sit and wait for all the acks" No. There is no such thing as waiting for acks. Acks are not a thing you wait for or that propagates. The cache does the same job for a simple write, and reads from the cache are always current. The only issue specific to RMW is keeping the data in the cache for the whole operation. – Whaleback
@Garold "force the read of the newest current value seems to be using an atomic read-modify-write operation" There is no basis in the language semantics for that claim. – Whaleback
@Whaleback Here's a good read on which ACKs go where: researchspace.auckland.ac.nz/bitstream/handle/2292/11594/…? – Oasis

Is a read on an atomic variable guaranteed to acquire the current value of it

No

Even though each atomic variable has a single modification order (which is observed by all threads), that does not mean every modification becomes visible to all threads at the same time.

Consider this code:

std::atomic<int> g{0};

// thread 1
g.store(42);

// thread 2
int a = g.load();
// do stuff with a
int b = g.load();

A possible outcome is:

  • thread 1: 42 is stored at time T1
  • thread 2: the first load returns 0 at time T2
  • thread 2: the store from thread 1 becomes visible at time T3
  • thread 2: the second load returns 42 at time T4.


This outcome is possible even though the first load at time T2 occurs after the store at T1 (in clock time).

The standard says:

Implementations should make atomic stores visible to atomic loads within a reasonable amount of time.

It does not require a store to become visible right away, and it even leaves room for a store to remain invisible (e.g. on systems without cache coherency). In that case, an atomic read-modify-write (RMW) is required to access the last value.

Atomic read-modify-write operations shall always read the last value (in the modification order) written before the write associated with the read-modify-write operation.

Needless to say, RMWs are more expensive to execute (they lock the bus), and that is why a regular atomic load is allowed to return an older (cached) value.
If a regular load were required to return the last value, performance would be horrible while there would be hardly any benefit.

Gustavogustavus answered 19/4, 2017 at 21:43 Comment(6)
In which practical case can a regular load return an old, obsolete value? Where was that old value stored? – Whaleback
@Whaleback In the cache, if it is not coherent. You can also consider a load obsolete if a value stored by another core is still in its store buffer (x86). The problem is that there is no definition of the current or latest value for regular stores and loads; "latest in the modification order" really only applies to RMW operations. – Gustavogustavus
Which modern architecture (that has some form of C++ or C++-like compiler) has a non-coherent cache? – Whaleback
@Whaleback No idea, but cache coherency is not a mandatory feature. – Gustavogustavus
It seems to me that coherency is essential to support any common PL with reasonable efficiency... Is there at least a prototype compiler of C++ for a theoretical system without cache coherency? – Whaleback
Cache coherency is a hardware feature. Platforms that do not support it can probably be found in the embedded world. – Gustavogustavus

It depends on the memory order you specify for the load() operation.

By default it is std::memory_order_seq_cst, and the answer is yes: it guarantees the current value stored by another thread (provided the store is visible at all, i.e. the store must use at least std::memory_order_release, otherwise its visibility is not guaranteed).

But if you specify std::memory_order_relaxed for the load operation, the documentation says: "Relaxed ordering: there are no synchronization or ordering constraints; only atomicity is required of this operation." I.e. the program could end up not reading from memory at all.

Cerebration answered 13/8, 2014 at 16:05 Comment(4)
I do not think memory_order_seq_cst actually guarantees this. It does guarantee that a global total order exists, but it does not say clearly how this order may be formed, i.e., the rules for constructing such an order. – Garold
The rules are described for each memory order. The implementation mechanism is to prevent omitting and reordering of operations before and after this atomic load (or store), on both the compiler and the processor side. Thus it is guaranteed that the store will happen in the right place and the load will happen in the right place among the other operations, and that the processor will execute these operations. – Cerebration
@Garold Consistency implies that the order of any atomic op (from seq_cst to relaxed) must follow execution order within a given thread (you don't go back in time); acquires and releases of the same objects by multiple threads "tie" the orders of operations together and force them to go along. OTOH, if you don't have atomic objects shared between threads, you have no such ties. But then why would you care about the abstract order? – Whaleback
"It does guarantee that a global total order exists" A global order of operations inside unrelated threads that share nothing is an abstract intellectual device with no physical counterpart. You must consider the threads that do interact via atomic objects: in practice these will be implemented with fences that become visible in some order, and atomic stores are implemented as real memory stores that put a cache line in an exclusive/modified state, which determines a modification order on the objects in that cache line. Fences make these modifications globally consistent. – Whaleback
