Let's assume we have a memory area that some thread writes data to. The thread then turns its attention elsewhere and allows arbitrary other threads to read the data. At some point, however, it wants to reuse that memory area and will write to it again.
The writer thread supplies a boolean flag (valid), which indicates that the memory is still valid to read from (i.e. the writer is not reusing it yet). At some point the writer will set this flag to false and never set it to true again (it flips once and that is it).
With sequential consistency, it should be correct to use these two code snippets for the writer and the readers, respectively:
...
valid = false;
<write to shared memory>
...
and
...
<read from shared memory>
if (valid) {
<be happy and work with data read>
} else {
<be sad and do something else>
}
...
We obviously need to do something to ensure sequential consistency, namely insert the necessary acquire and release memory barriers. We want the flag to be set to false in the writer thread before any data is written to the segment, and we want the data to be read from memory in the reader threads before valid is checked. The latter works because we know valid to be monotonic, i.e., if it is still true after reading, it was true while reading.
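The ordering requirement described here can be sketched with explicit fences. This is only a minimal sketch under stated assumptions, not a definitive implementation: the shared memory area is modeled by a single hypothetical std::atomic<int> named shared_data, accessed with relaxed operations so that a read overlapping the writer's reuse is not a data race, and the function names invalidate_and_overwrite and try_read are made up for illustration.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for the shared memory area (assumption: one word).
std::atomic<int>  shared_data{1};
std::atomic<bool> valid{true};

// Writer: revoke validity first, then reuse the memory.
void invalidate_and_overwrite(int new_value) {
    valid.store(false, std::memory_order_relaxed);
    // Full fence: keeps the flag store ordered before the data store.
    std::atomic_thread_fence(std::memory_order_seq_cst);
    shared_data.store(new_value, std::memory_order_relaxed);
}

// Reader: read the data first, then check the flag.
// Returns true and fills `out` only if the flag was still set after the read.
bool try_read(int& out) {
    int snapshot = shared_data.load(std::memory_order_relaxed);
    // Full fence: keeps the data load ordered before the flag load.
    std::atomic_thread_fence(std::memory_order_seq_cst);
    if (!valid.load(std::memory_order_relaxed)) {
        return false;   // be sad: the writer may already be reusing the memory
    }
    out = snapshot;     // be happy: the snapshot predates the invalidation
    return true;
}
```

If a reader observes the overwritten value, the two fences synchronize, so its later load of valid is guaranteed to see false; a reader that still sees true therefore holds data from before the reuse.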
Inserting a full fence between memory access and the access to valid will do the trick. I wonder, however, if making valid an atomic will be enough?
std::atomic<bool> valid{true};
Then
...
valid.store(false); // RELEASE
<write to shared memory>
...
and
...
<read from shared memory>
if (valid.load()) { // ACQUIRE
<be happy and work with data read>
} else {
<be sad and do something else>
}
...
It seems that in this scenario, the implied release and acquire operations from using the atomic store and load work against me. The RELEASE in the writer does not prevent the memory access from being moved up over it (only code from above may not be moved down). Similarly, the ACQUIRE in the readers does not prevent the memory access from being moved down over it (only code from below may not be moved up).
If this is true, to make this scenario work, I would need an ACQUIRE (i.e. a load) in the writer thread as well and a RELEASE (i.e. store) in the reader threads. Alternatively, I could just use a normal boolean flag and protect the write and read access (to it only!) in the threads with a shared mutex. By doing so I would effectively also have both an ACQUIRE and a RELEASE in both threads, separating the valid access from the memory access.
So this would be a very severe difference between atomic<bool> and a regular bool protected by a mutex, is this correct?
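The mutex alternative mentioned above might look roughly like the following. This is a sketch under assumptions, not a verified solution: the memory area is again modeled by a hypothetical relaxed std::atomic<int> (so that a read overlapping the writer's reuse is not a data race), and the names valid_flag, flag_mutex, retire_and_overwrite and read_if_valid are invented for illustration. Only the flag access sits inside the critical section.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

std::atomic<int> shared_data{1};  // hypothetical stand-in for the memory area
bool       valid_flag = true;     // plain bool, only touched under flag_mutex
std::mutex flag_mutex;

// Writer: clear the flag under the lock, then reuse the memory.
void retire_and_overwrite(int new_value) {
    {
        std::lock_guard<std::mutex> lock(flag_mutex);
        valid_flag = false;
    }   // unlock happens here, before the data is overwritten
    shared_data.store(new_value, std::memory_order_relaxed);
}

// Reader: read the data, then check the flag under the lock.
// Returns true and fills `out` only if the flag was still set.
bool read_if_valid(int& out) {
    int snapshot = shared_data.load(std::memory_order_relaxed);
    bool still_valid;
    {
        std::lock_guard<std::mutex> lock(flag_mutex);
        still_valid = valid_flag;
    }
    if (!still_valid) {
        return false;   // be sad
    }
    out = snapshot;     // be happy
    return true;
}
```

A reader that sees the flag still set must have held the lock before the writer did, so its unlock synchronizes with the writer's lock and its data read is ordered before the overwrite.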
Edit: There actually seems to be a difference in what is implied by a load and a store on atomics. The std::atomic of C++11 uses memory_order_seq_cst by default for both (!), rather than memory_order_acquire and memory_order_release for load and store respectively.
In contrast, tbb::atomic uses memory_semantics::acquire and memory_semantics::release rather than memory_semantics::full_fence.
So if my understanding is correct, the code would be correct with standard C++11 atomics, but with tbb atomics one would need to add the explicit memory_semantics::full_fence template parameter to both load and store.
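For the standard library side, the default and explicit spellings can be put side by side. In C++11 the memory-order argument of std::atomic load and store defaults to std::memory_order_seq_cst, so the first two functions below request the same ordering. The function names are hypothetical, and the sketch only shows how each ordering is requested; it does not by itself settle whether seq_cst on the flag is enough to order the surrounding memory accesses.

```cpp
#include <atomic>
#include <cassert>

std::atomic<bool> valid{true};

// Both stores request sequential consistency; the first relies on the default.
void clear_default()  { valid.store(false); }
void clear_explicit() { valid.store(false, std::memory_order_seq_cst); }

// Weaker orderings have to be requested explicitly.
void clear_release()  { valid.store(false, std::memory_order_release); }
bool check_acquire()  { return valid.load(std::memory_order_acquire); }

// A load without an argument is again seq_cst.
bool check_default()  { return valid.load(); }
```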
[...] valid = true; when it finishes writing and the reader to set valid = false; when it's finished reading? – Ugh
[...] std::atomic, at least after watching Herb Sutter's talk about atomic<> weapons :-) – Indispensable