acquire-release pair out of order execution

I'm wondering whether it is possible for an atomic load to return an old value in an acquire-release pair. Suppose we have an atomic variable x: we store to it with release semantics and later load it with acquire semantics. Is it possible, in theory, to read the old value?

#include <atomic>
#include <cassert>

std::atomic<int> x = 0;

void thread_1()
{
   x.store(1, std::memory_order_release);
}
void thread_2()
{
   assert(x.load(std::memory_order_acquire) != 0);
}

If thread_1 has already finished when thread_2 loads x (so the new value has been stored), is it possible for thread_2 to load the old value of x? In other words, if the actual store to x is done before the load, is it possible for the assert to fire?

As far as I understood from articles on the internet it is possible, but I cannot understand why. The memory fence generated by the store to x guarantees that the store buffer is drained, while the acquire fence in the load from x guarantees that the cache line is invalidated, so it has to read the up-to-date value.

added

Does this mean that acquire-release by itself doesn't enforce any ordering? It only guarantees that anything done before the release happens before the release and everything done after the acquire happens after it, so the acquire-release pair enforces ordering on the other operations (why??). Did I get that right? Does it mean that in the code below the assert is guaranteed not to fire?

std::atomic<int> x = 0;
std::atomic<int> y = 0;

void thread_1()
{
   y.store(1, std::memory_order_relaxed);
   x.store(1, std::memory_order_release);
}
void thread_2()
{
   x.load(std::memory_order_acquire);
   assert(y.load(std::memory_order_relaxed) != 0);
}

again, of course, only if thread_1 has already finished the store. If we replace x.load with while (x.load() == 0) this is guaranteed to work, but I don't know what causes it to work.
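
Here is a minimal sketch of the while-loop variant I mean (the comments reflect my current understanding, assuming this is the standard message-passing pattern):

#include <atomic>
#include <cassert>

std::atomic<int> x{0};
std::atomic<int> y{0};

void thread_1()
{
   y.store(1, std::memory_order_relaxed);
   x.store(1, std::memory_order_release);   // supposed to publish the write to y
}

void thread_2()
{
   // Spin until the store is actually observed. Once this acquire load
   // returns 1, it should synchronize-with the release store, so the earlier
   // write to y should be visible below.
   while (x.load(std::memory_order_acquire) == 0)
      ;
   assert(y.load(std::memory_order_relaxed) != 0);   // should not fire
}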

And what if I replace the code with the code below?

std::atomic<int> x = 0;

void thread_1()
{
   x.exchange(1, std::memory_order_acq_rel);
}
void thread_2()
{
   assert(x.exchange(0, std::memory_order_acq_rel) != 0);
}

Does it change anything?

Thanks.

Replenish answered 14/12, 2010 at 18:45 Comment(3)
I want to know whether or not it is guaranteed not to fire (of course, in the case where the actual store was done before the load).Replenish
yes, your edit had already cleared that up. My comment was written before your edit was displayed.Eudiometer
When your shared memory state consists of exactly std::atomic<int> x = 0; it doesn't matter which memory order you use!Hesitation

You might consider store/load functions with release/acquire memory order as the following pseudo-code:

template<class T>
struct weak_atomic
{
   void store(T newValue)
   {
      ReleaseBarrier();    // order earlier writes before the assignment below
      m_value = newValue;
   }

   T load()
   {
      T value = m_value;
      AcquireBarrier();    // order later reads after the read above
      return value;
   }

   volatile T m_value;
};
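
In real C++ this pseudo-code maps fairly directly onto std::atomic_thread_fence around relaxed accesses; here is a rough sketch of that mapping (only an illustration, not a production implementation):

#include <atomic>

template<class T>
struct weak_atomic
{
   void store(T newValue)
   {
      // "ReleaseBarrier()": orders everything written before this point
      // ahead of the relaxed store below.
      std::atomic_thread_fence(std::memory_order_release);
      m_value.store(newValue, std::memory_order_relaxed);
   }

   T load()
   {
      T value = m_value.load(std::memory_order_relaxed);
      // "AcquireBarrier()": orders everything read after this point
      // behind the relaxed load above.
      std::atomic_thread_fence(std::memory_order_acquire);
      return value;
   }

   std::atomic<T> m_value{};
};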

You said

The memory fence generated by the store to x guarantees that the store buffer is drained

As I understand it, the release memory barrier will cause the CPU to flush its store buffer, but that is done before the new value is applied to x. So it still seems possible for another CPU to read the old value of x.

Anyway, weak atomics are a very complex area. Make sure you understand memory barriers before proceeding with lock-free programming.

ADDED

It seems you are still confused with memory barriers. This is a pretty common example of their usage.

volatile int  x;
volatile bool ok;

void thread_1()
{
   x = 100;
   ok = true;
}

void thread_2()
{
   if (ok)
   {
      assert(x == 100);
   }
}

Due to out-of-order execution you may get the following sequence:

thread 1 sets ok to true
thread 2 checks ok is true and reads some garbage from x
thread 1 sets x to 100 but it is too late

Another possible sequence:

thread 2 reads some garbage from x
thread 2 checks for ok value

We may fix that with release and acquire memory barriers.

volatile int  x;
volatile bool ok;

void thread_1()
{
   x = 100;
   ReleaseBarrier();
   ok = true;
}

void thread_2()
{
   if (ok)
   {
      AcquireBarrier();
      assert(x == 100);
   }
}

ReleaseBarrier() guarantees that memory writes can't jump over the barrier. It means that ok is only set to true when x already contains a valid value.

AcquireBarrier() guarantees that memory reads can't jump over the barrier. It means that the value of x is only read after the ok flag has been checked.

This is how a release/acquire pair is intended to be used. We can rewrite this example with my weak_atomic:

volatile int  x;
weak_atomic<bool> ok;

void thread_1()
{
   x = 100;
   ok.store(true);
}

void thread_2()
{
   if (ok.load())
   {
      assert(x == 100);
   }
}
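
As a cross-check, roughly the same pattern can be written directly with std::atomic and a spin loop (a sketch only; the while loop stands in for "thread_2 eventually observes ok == true"):

#include <atomic>
#include <cassert>

int x = 0;                      // plain data, published via the flag
std::atomic<bool> ok{false};

void thread_1()
{
   x = 100;
   ok.store(true, std::memory_order_release);   // publish x
}

void thread_2()
{
   // The acquire load that finally returns true synchronizes-with the
   // release store, so the write x = 100 is visible afterwards.
   while (!ok.load(std::memory_order_acquire))
      ;
   assert(x == 100);   // cannot fire
}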
Estuarine answered 14/12, 2010 at 19:48 Comment(10)
Hm. We can replace the pseudocode with the real thing. We can put std::atomic_thread_fence(std::memory_order_release) there and make the operation itself "relaxed". Actually you are right, the fence is applied before; I didn't think about that, but I don't get how this makes sense. The release barrier is supposed to be there to publish the changes, but it actually does nothing. I've updated my question above.Replenish
@axl: I've added an example of how release/acquire pair is intended to be used. I hope it helps.Estuarine
@Stas: Actually I meant something different; probably I failed to explain it, because when I read my question today I was confused as well :) The classic example of memory barriers uses a while loop, which is absent in my example. My question was: is there a guarantee that a load tagged with memory_order_acquire will immediately see the store tagged with memory_order_release (like with the sequentially consistent model)?Replenish
It's really difficult to explain what exactly I don't understand :) I've read a lot of articles and literature about fences and atomics and it seems that I've understood everything written there, but something is missing, something I don't get, and it's difficult even to understand what exactly. I'm waiting for the moment when I'll find this missing bit of information, understand the whole picture and say AHA!!! I got it! :) Sometimes it happens.Replenish
@axl: Possibly, reading this document will help you: rdrop.com/users/paulmck/scalability/paper/whymb.2010.06.07c.pdfEstuarine
@Stas: Thanks. If you know any other resources that contain such a detailed explanation, please send me the link. Thank you for the answer.Replenish
"release memory barrier will cause the CPU to flush its store buffer" No flushing needed (in general), but the operations don't get reordered in such way that the stored value (in the atomic) is visible before the content of the store buffer filled in at that pointHesitation
@Replenish I think it's not the proper way to think of acquire-release as "can a load-acquire see a store-release immediately?". Google "store buffer litmus test" and you'll see there is no such "immediately". When you say "immediately", you probably mean that the store-release "happens before" the load-acquire. There is no such "happens before" relationship unless you explicitly establish one. One possible way of establishing it is to loop on a load-acquire of "yet another flag" in one thread and store-release that flag in the other thread. Things around this "yet another flag" are what acquire/release is meant for.Gesticulation
@Replenish > The classic example of memory barriers uses a while loop, which is absent in my example. Then add one. AFAIK, the only other way to trigger the loading of x after the storing of x is by using callbacks.Fifield
Thanks! This answer, together with preshing.com/20120913/acquire-and-release-semantics, helped me a lot.Heartrending
