Thread.VolatileRead() vs Volatile.Read()
We are told to prefer Volatile.Read over Thread.VolatileRead in most cases because the latter emits a full fence, while the former emits only the relevant half-fence (e.g. an acquire fence), which is more efficient.

However, in my understanding, Thread.VolatileRead actually offers something that Volatile.Read does not, because of the implementation of Thread.VolatileRead:

public static int VolatileRead(ref int address) {
  int num = address;
  Thread.MemoryBarrier();
  return num;
}

Because of the full memory barrier on the second line of the implementation, I believe that VolatileRead actually ensures that the value last written to address will be read. According to Wikipedia, "A full fence ensures that all load and store operations prior to the fence will have been committed prior to any loads and stores issued following the fence."

Is my understanding correct? And therefore, does Thread.VolatileRead still offer something that Volatile.Read does not?

Interconnect answered 21/3, 2014 at 20:49 Comment(0)

I may be a little late to the game, but I would still like to chime in. First we need to agree on some basic definitions.

  • acquire-fence: A memory barrier in which other reads and writes are not allowed to move before the fence.
  • release-fence: A memory barrier in which other reads and writes are not allowed to move after the fence.

I like to use an arrow notation to help illustrate the fences in action. An ↑ arrow will represent a release-fence and a ↓ arrow will represent an acquire-fence. Think of the arrow head as pushing memory access away in the direction of the arrow. But, and this is important, memory accesses can move past the tail. Read the definitions of the fences above and convince yourself that the arrows visually represent those definitions.

Using this notation let us analyze the examples from JaredPar's answer starting with Volatile.Read. But, first let me make the point that Console.WriteLine probably produces a full-fence barrier unbeknownst to us. We should pretend for a moment that it does not to make the examples easier to follow. In fact, I will just omit the call entirely as it is unnecessary in the context of what we are trying to achieve.

// Example using Volatile.Read
x = 13;
var local = y; // Volatile.Read
↓              // acquire-fence
z = 13;

So using the arrow notation we can see more easily that the write to z cannot move up and before the read of y. Nor can the read of y move down and after the write to z, because that would be effectively the same thing the other way around. In other words, the fence locks the relative ordering of y and z. However, the read of y and the write to x can be swapped, as there is no arrow head preventing that movement. Likewise, the write to x can move past the tail of the arrow, even past the write to z; the specification technically allows that, theoretically anyway. That means we have the following valid orderings.

Volatile.Read
---------------------------------------
write x    |    read y     |    read y
read y     |    write x    |    write z
write z    |    write z    |    write x

Now let us move on to the example with Thread.VolatileRead. For the sake of the example I will inline the call to Thread.VolatileRead to make it easier to visualize.

// Example using Thread.VolatileRead
x = 13;
var local = y; // inside Thread.VolatileRead
↑              // Thread.MemoryBarrier / release-fence
↓              // Thread.MemoryBarrier / acquire-fence
z = 13;

Look closely. There is no arrow (because there is no memory barrier) between the write to x and the read of y. That means these memory accesses are still free to move around relative to each other. However, the call to Thread.MemoryBarrier, which produces the additional release-fence, makes it appear as if the next memory access had volatile write semantics. This means the writes to x and z can no longer be swapped.

Thread.VolatileRead
-----------------------
write x    |    read y
read y     |    write x
write z    |    write z

Of course, it has been claimed that Microsoft's implementation of the CLI (the .NET Framework) and x86 hardware already guarantee release-fence semantics for all writes, so in that case there may be no difference between the two calls. On an ARM processor with Mono, however, things might be different.

Let us move on now to your questions.

Because of the full memory barrier on the second line of the implementation, I believe that VolatileRead actually ensures that the value last written to address will be read. Is my understanding correct?

No. This is not correct! A volatile read is not the same as a "fresh read". Why? Because the memory barrier is placed after the read instruction, the actual read is still free to move up, i.e. backwards in time. Another thread could write to the address, but the current thread might have already moved its read to a point in time before that other thread committed the write.

So this raises the question, "Why do people bother using volatile reads if they seemingly guarantee so little?". The answer is that a volatile read absolutely guarantees that the next read will return a value no older than the previous read. That is its value! That is why a lot of lock-free code spins in a loop until the logic can determine that the operation completed successfully. In other words, lock-free code exploits the fact that a later read in a sequence of reads returns a value no older than the earlier ones, but the code should not assume that any single read necessarily represents the latest value.

Think about this for a minute. What does it even mean for a read to return the latest value anyway? By the time you use that value it might not be the latest anymore. Another thread may have already written a different value to the same address. Can you still call that value the latest?

But if, after considering the caveats discussed above of what it even means to have a "fresh" read, you still want something that acts like one, then you would need to place an acquire-fence before the read. Note that this is clearly not the same thing as a volatile read, but it would better match a developer's intuition of what "fresh" means. However, the term "fresh" in this case is not absolute. Instead, the read is "fresh" relative to the barrier: it cannot be any older than the point in time at which the barrier was executed. But, as mentioned above, the value may no longer be the latest by the time you use it or make a decision based on it. Just keep that in mind.

And therefore, does Thread.VolatileRead still offer something that Volatile.Read does not?

Yes. I think JaredPar presented a perfect example of a case where it can offer something additional.

Mousy answered 1/8, 2014 at 20:22 Comment(15)
What are possible applications for this? I mean, you either have thread safe code that is synchronized somehow or you don't. Why would this level of granularity be needed in controlling memory reads/writes? If the answer is too long I'll just ask a new question.Hammurabi
Hi Brian, thanks for your extra clarification. Also, how did you randomly find this question, haha? :) Since I asked it I also learned a bit more, and started investigating MESI etc. After carefully reading through your answer I think these days I'm more or less getting towards a decent understanding; with the caveat that I need to have had caffeine. Anyway, you might also like another question I posted, I actually found it opened up the next level of understanding for me.Interconnect
@Motig: I occasionally search for memory barrier questions and attempt an answer if I'm not satisfied with the existing answers or if I think I can add a different perspective. I favorited your linked question. I'll take a look when I get time.Mousy
@BrianGideon It took me a while, but I understand now what you're saying. I get it: Memory barriers and volatile semantics etc don't provide any immediacy or 'flushing'... Only ordering guarantees. Is that right?Interconnect
@Motig: Yep, that is correct! Though, like I said you can usually exploit those ordering guarantees to get the desired behavior, albeit with a lot of careful thought put into it.Mousy
I realize that this is an older thread but still ... Documentation for Volatile.Read states this: "The value that was read. This value is the latest written by any processor in the computer, regardless of the number of processors or the state of processor cache." So it seems that it does return the latest value at that point in time.Submerse
@dcarapic: The documentation does not match the specification for how the memory model works in .NET. There is a disconnect between the documentation for several of the memory-barrier-generating API calls and the specification. This has been brought up before. The point...take the documentation with a grain of salt. In practice, because of the way these methods are used in tight loops, they will return the latest value.Mousy
Whenever I think I figured out the best way to have a thread-safe variable I find out something new sigh. This multi-threading stuff is really annoying ...Submerse
So is declaring a volatile variable giving us the semantics of Thread.VolatileRead or Volatile.Read?Augustaaugustan
@William: I believe it would be closer to Volatile.Read.Mousy
@BrianGideon You said that memory barriers don't enforce "flushing" but only ordering guarantees. I have a code example whose result is affected by whether a "read" that occurs after a "write" (due to a barrier) can see the value by the last "write". Can you please comment on that? #38051181Koph
Does the volatile keyword enforce half fence like the Volatile.Read() and Volatile.Write()?Koph
Awesome article. One thing confused me: ...Nor can the write move down and after the read because that would be effectively same as the other way around... This states that the x write cannot move down after the y read, but your table shows it can. Which makes sense due to the fence being set up after the read.Td
@Brain2000: Ha...yeah. I totally botched that sentence. I don't even know what I was thinking. Anyway, I corrected it. Take a look. Also, I added another paragraph at the end that suggests placing the barrier before the read if you want "fresh"-like semantics.Mousy
@BrianGideon The documentation for Volatile.Read() return value states: The value that was read. This value is the latest written by any processor in the computer, regardless of the number of processors or the state of processor cache. That sounds like a "fresh read" to me, no?Zipangu

Volatile.Read essentially guarantees that read and write operations which occur after it cannot be moved before the read. It says nothing about preventing write operations which occur before the read from moving past it. For example:

// assume x, y and z are declared 
x = 13;
Console.WriteLine(Volatile.Read(ref y));
z = 13;

There is no guarantee that the write to x occurs before the read of y. However the write to z is guaranteed to occur after the read of y.

// assume x, y and z are declared 
x = 13;
Console.WriteLine(Thread.VolatileRead(ref y));
z = 13;

In this case, though, you are guaranteed that the order is

  • write x
  • read y
  • write z

The full fence prevents both reads and writes from moving across it in either direction.

Dihedron answered 21/3, 2014 at 21:6 Comment(11)
It depends on the platform, on x86/x64, Thread.VolatileRead and Volatile.Read have the same result due to the existing strong memory model.Kincaid
@PeterRitchie correct, when you take into account platform it can narrow the constraints.Dihedron
@JaredPar Is there a reference on this? MSDN is very vague, Volatile.Read is the only one that mentions a barrier and only vaguely says "memory barrier". Volatile.Read does say "a read or write appears after this method in the code, the processor cannot move it before this method"--which is what that fence does; but you have to know that in order to compare with "memory barrier". I know you're right; but it would be nice to reference something from Microsoft (and have better Microsoft documentation) rather than something from Stackoverflow :(Kincaid
@PeterRitchie the doc here is pretty specific about the type of fence that Volatile.Read will setup. Were you looking for something more? msdn.microsoft.com/en-us/library/gg712828(v=vs.110).aspxDihedron
@Dihedron Unless you know what "a read or write appears after this method in the code, the processor cannot move it before this method" means, in terms of a fence, it's not specific at all...Kincaid
@PeterRitchie It's not quite proof, but to quote "CLR via C#" that I have in front of me here: "...Any later program-order loads and stores must occur after the call to Volatile.Read". I know it doesn't explicitly say 'acquire fence' - but isn't that pretty much the exact behaviour of one?Interconnect
I don't think this is quite correct. In your second example there is no memory barrier between the write to x and the VolatileRead of y (whose implementation shows the read happens BEFORE the memory barrier), so it's still possible x and y could get reordered. Maybe I'm wrong; it's all very confusing stuff.Augustaaugustan
@PeterRitchie: "MSDN is very vague." Are the exquisitely subtle distinctions asserted across this page based on anything other than unwarranted scrutiny of hazy docs? As for facts, the mscorlib IL for Thread.VolatileRead vs. Volatile.Read in .NET 4.7.2 are identical: ldarg.0; ldind.i4; stloc.0; call void Thread::MemoryBarrier(); ldloc.0; ret; (attributes differ: MethodImplOptions.NoInlining vs. CER). I know there's special-case JIT meddling involved here, but can anyone point to actual code in src/vm that backs up the claim that there's any difference between these two calls?Forsberg
In case my point wasn't clear... We all know that this particular aspect of concurrency is especially subtle/exacting and requires precise information and extreme attention to detail. The MSDN docs in the case here patently fail to respect or reach the requisite level of seriousness/credibility, so I don't see how they can be of any use whatsoever. The ECMA technical specs might be better; has anyone found anything related to this in those? But again, inherent complexities may likely render them insufficiently precise as well, such that nothing short of empirical JIT results will be conclusiveForsberg
@GlennSlayden w.r.t. the IL, what processor was used, and what were the Build/Platform target settings? What is needed differs from x86 to x64.Kincaid
@GlennSlayden Also, IIRC, Ecma only mentions Thread.VolatileRead and Thread.VolatileWrite--which define them as being equivalent to using the IL volatile. prefix. This means VolatileRead provides "acquire semantics" and VolatileWrite provides "release semantics" with no extra atomicity guarantees. Volatile.Read and Volatile.Write just explicitly call Thread.MemoryBarrier after the read and before the write. I haven't found anything that says that's identical to volatile or VolatileRead/VolatileWrite. :( But, they're all documented in a way that can be inferred as doing the same thing.Kincaid

© 2022 - 2024 — McMap. All rights reserved.