Half-fences and full fences? [duplicate]
I've been reading that full fences prevent any kind of instruction reordering or caching around the fence (via Thread.MemoryBarrier).

Then I read about volatile, which generates "half-fences":

The volatile keyword instructs the compiler to generate an acquire-fence on every read from that field, and a release-fence on every write to that field.

acquire-fence

An acquire-fence prevents other reads/writes from being moved before the fence;

release-fence

A release-fence prevents other reads/writes from being moved after the fence.

Can someone please explain these two sentences in simple English?

(Where is the fence?)

edit

After reading some answers here, I've made a drawing which I think can help everyone.

https://i.sstatic.net/A5F7P.jpg

Drainpipe answered 14/5, 2012 at 19:14 Comment(2)
Unless I misunderstood the documentation, volatile variables do not require all writes to be finished. Instead they require all reads to get the current value, rather than using a buffered value which may not be current if another thread has changed the variable.Gleich
Yep that drawing pretty much summarizes the purpose of acquire-release fences.Calamander

The wording you refer to looks like the wording I often use. The specification says this, though:

  • A read of a volatile field is called a volatile read. A volatile read has "acquire semantics"; that is, it is guaranteed to occur prior to any references to memory that occur after it in the instruction sequence.
  • A write of a volatile field is called a volatile write. A volatile write has "release semantics"; that is, it is guaranteed to happen after any memory references prior to the write instruction in the instruction sequence.

But, I usually use the wording you cited in your question because I want to put the focus on the fact that instructions can be moved. The wording you cited and the specification are equivalent.

I am going to present several examples. In these examples I am going to use a special notation that uses an ↑ arrow to indicate a release-fence and a ↓ arrow to indicate an acquire-fence. No other instruction is allowed to float down past a ↑ arrow or up past a ↓ arrow. Think of the arrow head as repelling everything away from it.

Consider the following code.

static int x = 0;
static int y = 0;

static void Main()
{
  x++;
  y++;
}

Rewriting it to show the individual instructions would look like this.

static void Main()
{
  read x into register1
  increment register1
  write register1 into x
  read y into register1
  increment register1
  write register1 into y
}

Now, because there are no memory barriers in this example the C# compiler, JIT compiler, or hardware is free to optimize it in many different ways as long as the logical sequence as perceived by the executing thread is consistent with the physical sequence. Here is one such optimization. Notice how the reads and writes to/from x and y got swapped.

static void Main()
{
  read y into register1
  read x into register2
  increment register1
  increment register2
  write register1 into y
  write register2 into x
}

Now, this time we will change those variables to volatile. I will use our arrow notation to mark the memory barriers. Notice how the order of the reads and writes to/from x and y is preserved. This is because instructions cannot move past our barriers (denoted by the ↓ and ↑ arrow heads). Now, this is important. Notice that the increment and write of x instructions were still allowed to float down and the read of y floated up. This is still valid because we are using half fences.

static volatile int x = 0;
static volatile int y = 0;

static void Main()
{
  read x into register1
  ↓    // volatile read
  read y into register2
  ↓    // volatile read
  increment register1
  increment register2
  ↑    // volatile write
  write register1 into x
  ↑    // volatile write
  write register2 into y
}

This is a very trivial example. Look at my answer here for a non-trivial example of how volatile can make a difference in the double-checked pattern. I use the same arrow notation I used here to make it easy to visualize what is happening.
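To make the double-checked pattern mentioned above concrete, here is a minimal, hypothetical Java sketch (Java's volatile offers at least the acquire/release guarantees discussed here; the class name `Lazy` is made up for illustration):

```java
// Sketch of double-checked locking; the volatile read/write on `instance`
// supplies the acquire/release fences that keep the pattern safe.
final class Lazy {
    private static volatile Lazy instance; // volatile is essential here

    private Lazy() { }

    static Lazy getInstance() {
        Lazy local = instance;             // volatile read: acquire-fence
        if (local == null) {
            synchronized (Lazy.class) {
                local = instance;          // re-check inside the lock
                if (local == null) {
                    local = new Lazy();
                    instance = local;      // volatile write: release-fence
                }
            }
        }
        return local;
    }
}
```

Without volatile on `instance`, the write `instance = local` could float above the constructor's writes, letting another thread observe a half-constructed object.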

Now, we also have the Thread.MemoryBarrier method to work with. It generates a full fence. So if we use our arrow notation we can visualize how that works as well.

Consider this example.

static int x = 0;
static int y = 0;

static void Main()
{
  x++;
  Thread.MemoryBarrier();
  y++;
}

Which then looks like this if we are to show the individual instructions as before. Notice that instruction movement is prevented altogether now. There is really no other way this can get executed without compromising the logical sequence of the instructions.

static void Main()
{
  read x into register1
  increment register1
  write register1 into x
  ↑    // Thread.MemoryBarrier
  ↓    // Thread.MemoryBarrier
  read y into register1
  increment register1
  write register1 into y
}
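For comparison, Java exposes a rough analog of Thread.MemoryBarrier in `VarHandle.fullFence()` (assumption: Java 9 or later; the class name is made up for illustration). A sketch of the same x++/y++ example:

```java
import java.lang.invoke.VarHandle;

class FullFenceDemo {
    static int x = 0;
    static int y = 0;

    static void increment() {
        x++;
        // Rough analog of Thread.MemoryBarrier(): a full fence, so no
        // read or write may cross it in either direction.
        VarHandle.fullFence();
        y++;
    }
}
```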

Okay, one more example. This time let us use VB.NET. VB.NET does not have the volatile keyword. So how can we mimic a volatile read in VB.NET? We will use Thread.MemoryBarrier.¹

Public Function VolatileRead(ByRef address As Integer) As Integer
  Dim local As Integer = address
  Thread.MemoryBarrier()
  Return local
End Function

And this is what it looks like with our arrow notation.

Public Function VolatileRead(ByRef address As Integer) As Integer
  read address into register1
  ↑    // Thread.MemoryBarrier
  ↓    // Thread.MemoryBarrier
  return register1
End Function

It is important to note that since we want to mimic a volatile read the call to Thread.MemoryBarrier must be placed after the actual read. Do not fall into the trap of thinking that a volatile read means a "fresh read" and a volatile write means a "committed write". That is not how it works and it certainly is not what the specification describes.
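The same read-then-barrier shape can be sketched in Java using `VarHandle.acquireFence()` (a hypothetical illustration; the `slot` array and method name are made up):

```java
import java.lang.invoke.VarHandle;

class VolatileReadDemo {
    static int[] slot = { 0 };  // plain, non-volatile storage

    static int volatileRead() {
        int local = slot[0];       // the actual read happens first
        VarHandle.acquireFence();  // the barrier goes AFTER the read,
                                   // matching the VB.NET example above
        return local;
    }
}
```

Placing the fence after the read prevents later reads from floating up above this one; it does not force a "fresh" load at the fence itself.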

Update:

In reference to the image.

wait! I am verifying that all the Writes are finished!

and

wait! I am verifying that all the consumers have got the current value!

This is the trap I was talking about. The statements are not completely accurate. Yes, a memory barrier implemented at the hardware level may synchronize the cache coherency lines, and as a result the statements above may be a somewhat accurate account of what happens. But volatile does nothing more than restrict the movement of instructions. The specification says nothing about loading a value from memory or storing it to memory at the spot where the memory barrier is placed.


¹ There is, of course, the built-in Thread.VolatileRead already. And you will notice that it is implemented exactly as I have done here.

Hanker answered 14/5, 2012 at 20:57 Comment(18)
@KaseySpeakman:Thanks. Be sure to read my answer here for a similar explanation except in that case it is about the double-checked locking pattern. I have to be honest. It took me a really long time before all of this memory barrier stuff finally "clicked" with me.Hanker
Brian can you please have a look at my question ?(and of course post an answer , I know you are a master at threading). https://mcmap.net/q/456183/-deadlock-clarificationDrainpipe
@RoyiNamir: Ask and ye shall receive. Done.Hanker
@BrianGideon Correct the VB.NET example by moving Thread.MemoryBarrier() before Return address.Unpleasant
@HamletHakobyan: Oh, but it already IS correct the way I have it. I explain why it has to be this way is in my answer. But, in a nutshell, it is because a volatile read absolutely does NOT mean a "fresh read". You can verify the order of the statements by decompiling the real Thread.VolatileRead method yourself. I know, it's really confusing isn't it!Hanker
@BrianGideon I think you missed something. It should look like this: Public Function VolatileRead(ByRef address As Integer) As Integer Dim i As Integer = address Thread.MemoryBarrier() Return i End FunctionUnpleasant
@HamletHakobyan: Yes, of course. I totally missed that. I will fix that when I get a chance. Nice catch! And thanks for finding it.Hanker
@HamletHakobyan: Okay, I had a chance to take a closer look at that the way I originally had it. Oddly enough, VB.NET allows you put statements after a Return and it will compile without errors or warnings. But, it won't execute any statement appearing after a Return because it injects a br.s IL instructions to jump over them. Strange. You'd think there would at least be a warning.Hanker
@BrianGideon Brian - in your VB example , how can it possibly be that instructions would swap ? even with a single thread POV , changing the order would mean completely different code. ( im talking about Dim local = address; Return local). it is not like the x++ , y++ which I do understand that with single thread POV - it doesn't matter who is first.Drainpipe
@RoyiNamir: Imagine what would happen without the call to Thread.MemoryBarrier. The compiler could then simplify the code to a single line Return address. And then going further and because the function is simple it would likely be inlined essentially transforming the entire function call into just a normal memory read subject to all of the common reordering strategies now.Hanker
@BrianGideon Hi Brian - is there a way how to guarantee freshness then? I have posted a question on this if you are interested to take a look: #24678273 Thanks!Mishamishaan
Brian, reading it now I ask myself (without knowing): why does a read use ↓ while a write uses ↑? I mean, if I swap(!!) the definition above, I can easily find a reason which would also make sense: "A volatile read has "acquire semantics"; that is, it is guaranteed to occur after any references to memory that occur before it in the instruction sequence" - why? Because I would want to see the most updated(!!!) value.Drainpipe
...continue (from last comment)... Same goes for write: "A volatile write has "release semantics"; that is, it is guaranteed to happen before any memory references after the write instruction in the instruction sequence." - why? Because I want the write operation to occur where it is written, not after. What am I missing?Drainpipe
@RoyiNamir: If you want a "fresh read" then you'll have to put a call to Thread.MemoryBarrier before the read. But, that's not the same thing as a volatile read. What a volatile read guarantees is that the next read will be newer than the previous. It does not guarantee that the current read will return the latest value.Hanker
Thanks @BrianGideon, I learned a lot! I'm a bit confused why you even need the MemoryBarrier in a function that only reads the variable. Nothing can move up or down because the function only contains one line. Or might that function potentially be optimized out if that's all it does? Let me go the other way then, would it matter if the MemoryBarrier comes before the read instead of after?Serge
@Brain2000: Yeah, assume any method could be inlined. But even without that once everything gets JIT compiled into native instructions the concept of method calls becomes blurred anyway. And yes, it does matter if the memory barrier generator is called before or after.Hanker
@BrianGideon Good point regarding the trap in the picture. "The volatile. prefix on certain instructions shall guarantee cross-thread memory ordering rules." It has no other capabilities.Theta
This statement, "The specification says nothing about loading a value from memory or storing it to memory at the spot where the memory barrier is place.", seems misleading to me. In particular, volatile semantics concerns not just instruction ordering, but also visibility of side-effects between processors. See e.g. stackoverflow.com/a/44832155. Your statement seems to imply (perhaps unintentionally?) that a thread could still read a stale value even after a volatile read of a value written by a volatile write that occurred after the write you want to observe, which wouldn't be truePeriosteum

Start from the other way:

What is important when you read a volatile field? That all previous writes to that field have been committed.

What is important when you write to a volatile field? That all previous reads have already got their values.

Then try to verify that the acquire-fence and release-fence makes sense in those situations.
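Those two rules are exactly what makes the classic flag-publication pattern work. A minimal Java sketch (class and field names are made up for illustration; Java's volatile provides at least the acquire/release fences discussed here):

```java
class Publication {
    static int data = 0;            // plain, non-volatile field
    static volatile boolean ready;  // volatile flag

    static void producer() {
        data = 42;     // plain write; the release-fence on the next line
        ready = true;  // keeps it from floating below this volatile write
    }

    static int consumer() {
        while (!ready) {            // volatile read: acquire-fence
            Thread.onSpinWait();
        }
        return data;                // guaranteed to observe 42
    }
}
```

The release on the write guarantees `data = 42` is committed before `ready` becomes visible; the acquire on the read guarantees the read of `data` is not hoisted above the check of `ready`.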

Grisaille answered 14/5, 2012 at 19:19 Comment(0)

To reason more easily about this, let's assume a memory model where any reordering is possible.

Let's see a simple example. Assume this volatile field:

volatile int i = 0;

and this sequence of read-write:

1. int a = i;
2. i = 3;

For instruction 1, which is a read of i, an acquire fence is generated. That means that instruction 2, which is a write to i cannot be reordered with instruction 1, so there is no possibility that a will be 3 at the end of the sequence.

Now, of course, the above does not make much sense if you consider a single thread, but if another thread were to operate on the same values (assume a is global):

thread 1               thread 2
a = i;                 b = a;
i = 3;

You'd think in this case that there is no possibility for thread 2 to get a value of 3 into b (since it would get the value of a before or after the assignment a = i;). However, if the read and write of i get reordered, it is possible that b gets the value 3. In this case making i volatile is necessary if your program correctness depends on b not becoming 3.

Disclaimer: The above example is only for theoretical purposes. Unless the compiler is completely crazy, it would not go and do reorderings that could create a "wrong" value for a variable (i.e. a could not be 3 even if i were not volatile).

Calamander answered 14/5, 2012 at 19:21 Comment(2)
@Royi Namir: For a more realistic example of why fences are important have a look here: cs.umd.edu/~pugh/java/memoryModel/jsr-133-faq.html#whatismm. In my example I used a write reordered with a read of the same variable, which cannot really happen in practice, but a read reordered with a read of different variables can happen.Calamander
According to my drawing (and @Brian Gideon's description from the specification) - when step 1 occurs, an ACQUIRE fence is generated and not a release fence....Drainpipe
