When should the volatile keyword be used in C#?

Can anyone provide a good explanation of the volatile keyword in C#? Which problems does it solve, and which does it not? In which cases will it save me the use of locking?

Condolent answered 16/9, 2008 at 13:39 Comment(1)
Why do you want to save on the use of locking? Uncontended locks add a few nanoseconds to your program. Can you really not afford a few nanoseconds?Chantell

I don't think there's a better person to answer this than Eric Lippert (emphasis in the original):

In C#, "volatile" means not only "make sure that the compiler and the jitter do not perform any code reordering or register caching optimizations on this variable". It also means "tell the processors to do whatever it is they need to do to ensure that I am reading the latest value, even if that means halting other processors and making them synchronize main memory with their caches".

Actually, that last bit is a lie. The true semantics of volatile reads and writes are considerably more complex than I've outlined here; in fact they do not actually guarantee that every processor stops what it is doing and updates caches to/from main memory. Rather, they provide weaker guarantees about how memory accesses before and after reads and writes may be observed to be ordered with respect to each other. Certain operations such as creating a new thread, entering a lock, or using one of the Interlocked family of methods introduce stronger guarantees about observation of ordering. If you want more details, read sections 3.10 and 10.5.3 of the C# 4.0 specification.

Frankly, I discourage you from ever making a volatile field. Volatile fields are a sign that you are doing something downright crazy: you're attempting to read and write the same value on two different threads without putting a lock in place. Locks guarantee that memory read or modified inside the lock is observed to be consistent, locks guarantee that only one thread accesses a given chunk of memory at a time, and so on. The number of situations in which a lock is too slow is very small, and the probability that you are going to get the code wrong because you don't understand the exact memory model is very large. I don't attempt to write any low-lock code except for the most trivial usages of Interlocked operations. I leave the usage of "volatile" to real experts.
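As a sketch of the lock-based alternative Lippert recommends (class and member names here are illustrative, not from any particular codebase), a stop flag shared between two threads can simply be guarded by an ordinary lock instead of being declared volatile:

```csharp
using System;
using System.Threading;

class StopFlagWithLock
{
    private readonly object _gate = new object();
    private bool _shouldStop; // only ever read or written while holding _gate

    public void RequestStop()
    {
        lock (_gate) { _shouldStop = true; }
    }

    public bool ShouldStop()
    {
        lock (_gate) { return _shouldStop; }
    }

    static void Main()
    {
        var flag = new StopFlagWithLock();
        var worker = new Thread(() =>
        {
            while (!flag.ShouldStop()) { /* simulate work */ }
            Console.WriteLine("Worker stopped.");
        });
        worker.Start();
        Thread.Sleep(50);    // let the worker spin briefly
        flag.RequestStop();
        worker.Join();       // completes: the lock guarantees the write is visible
    }
}
```

The lock both serializes access and provides the memory-ordering guarantees that volatile would, at a cost that is negligible for uncontended locks.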

For further reading see:

Gaddis answered 8/7, 2013 at 15:34 Comment(7)
I would down vote this if I could. There's a lot of interesting information in there, but it doesn't really answer his question. He is asking about the usage of the volatile keyword as it relates to locking. For quite a while (prior to 2.0 RT), the volatile keyword was necessary in order to properly make a static field thread safe if the field instance had any initialization code in the constructor (see AndrewTek's answer). There's a lot of 1.1 RT code still in production environments, and the devs that maintain it should know why that keyword is there and whether it's safe to remove.Cormac
@PaulEaster the fact that it can be used for double-checked locking (usually in the singleton pattern) doesn't mean that it should. Relying on the .NET memory model is probably a bad practice - you should rely on the ECMA model instead. For example, you might want to port to mono one day, which may have a different model. I also understand that different hardware architectures could change things. For more information see: https://mcmap.net/q/93392/-memory-model-guarantees-in-double-checked-locking. For better singleton alternatives (for all .NET versions) see: csharpindepth.com/articles/general/singleton.aspxGaddis
I didn't say one way or another whether it should be used. If part of your job is to maintain a legacy 1.1 application, you can't always apply best practices/models from the current runtime. It may in fact be necessary to keep the keyword within the code if it can't be ported over to a newer runtime. Also, you should rely on the memory model you will be running in production. If your application is having memory issues, telling your client that it's written that way in case it's ever ported to mono will be the last thing you ever tell them. It won't matter if your code is academically correct.Cormac
In other words, the correct answer to the question is: If your code is running in the 2.0 runtime or later, the volatile keyword is almost never needed and does more harm than good if used unnecessarily. But in earlier versions of the runtime, it IS needed for proper double check locking on static fields.Cormac
@PaulEaster Yes, but I maintain that double-checked locking itself is not needed, and can be replaced with a safe alternative from the article I linked to (specifically, solution 5).Gaddis
does this mean locks and volatile variables are mutually exclusive in the following sense: if I have used locks around some variable there is no need to declare that variable as volatile anymore?Pitchy
@Giorgi yes - the memory barriers guaranteed by volatile will be there by virtue of the lockGaddis

If you want to get slightly more technical about what the volatile keyword does, consider the following program (I'm using DevStudio 2005):

#include <iostream>
void main()
{
  int j = 0;
  for (int i = 0 ; i < 100 ; ++i)
  {
    j += i;
  }
  for (volatile int i = 0 ; i < 100 ; ++i)
  {
    j += i;
  }
  std::cout << j;
}

Using the standard optimised (release) compiler settings, the compiler creates the following assembler (IA32):

void main()
{
00401000  push        ecx  
  int j = 0;
00401001  xor         ecx,ecx 
  for (int i = 0 ; i < 100 ; ++i)
00401003  xor         eax,eax 
00401005  mov         edx,1 
0040100A  lea         ebx,[ebx] 
  {
    j += i;
00401010  add         ecx,eax 
00401012  add         eax,edx 
00401014  cmp         eax,64h 
00401017  jl          main+10h (401010h) 
  }
  for (volatile int i = 0 ; i < 100 ; ++i)
00401019  mov         dword ptr [esp],0 
00401020  mov         eax,dword ptr [esp] 
00401023  cmp         eax,64h 
00401026  jge         main+3Eh (40103Eh) 
00401028  jmp         main+30h (401030h) 
0040102A  lea         ebx,[ebx] 
  {
    j += i;
00401030  add         ecx,dword ptr [esp] 
00401033  add         dword ptr [esp],edx 
00401036  mov         eax,dword ptr [esp] 
00401039  cmp         eax,64h 
0040103C  jl          main+30h (401030h) 
  }
  std::cout << j;
0040103E  push        ecx  
0040103F  mov         ecx,dword ptr [__imp_std::cout (40203Ch)] 
00401045  call        dword ptr [__imp_std::basic_ostream<char,std::char_traits<char> >::operator<< (402038h)] 
}
0040104B  xor         eax,eax 
0040104D  pop         ecx  
0040104E  ret              

Looking at the output, the compiler has decided to use the ecx register to store the value of the j variable. For the non-volatile loop (the first) the compiler has assigned i to the eax register. Fairly straightforward. There are a couple of interesting bits though - the lea ebx,[ebx] instruction is effectively a multibyte nop instruction so that the loop jumps to a 16 byte aligned memory address. The other is the use of edx to increment the loop counter instead of using an inc eax instruction. The add reg,reg instruction has lower latency on a few IA32 cores compared to the inc reg instruction, but never has higher latency.

Now for the loop with the volatile loop counter. The counter is stored at [esp], and the volatile keyword tells the compiler the value should always be read from / written to memory and never assigned to a register. The compiler even goes so far as to not do the load/increment/store as three distinct steps (load eax, inc eax, save eax) when updating the counter value; instead, the memory is modified directly in a single instruction (an add mem,reg). The way the code has been created ensures the value of the loop counter is always up to date within the context of a single CPU core. No operation on the data can result in corruption or data loss (hence the avoidance of load/inc/store, since the value could change during the inc and thus be lost on the store). Since interrupts can only be serviced once the current instruction has completed, the data can never be corrupted, even with unaligned memory.

Once you introduce a second CPU to the system, the volatile keyword won't guard against the data being updated by another CPU at the same time. In the above example, you would need the data to be unaligned to get a potential corruption. The volatile keyword won't prevent potential corruption if the data cannot be handled atomically, for example, if the loop counter was of type long long (64 bits) then it would require two 32 bit operations to update the value, in the middle of which an interrupt can occur and change the data.

So, the volatile keyword is only good for aligned data which is less than or equal to the size of the native registers such that operations are always atomic.

The volatile keyword was conceived to be used with IO operations where the IO would be constantly changing but had a constant address, such as a memory mapped UART device, and the compiler shouldn't keep reusing the first value read from the address.

If you're handling large data or have multiple CPUs then you'll need a higher level (OS) locking system to handle the data access properly.

Wacky answered 16/9, 2008 at 14:41 Comment(3)
This is C++ but the principle applies to C#.Wacky
Eric Lippert writes that volatile in C++ only prevents the compiler from performing some optimisations, while in C# volatile additionally coordinates with the other cores/processors to ensure that the latest value gets read.Annettannetta
He specifically asked about C# and this answer is about C++. It is not at all apparent that the volatile keyword in C# behaves exactly like the volatile keyword in C++.Beaverette

If you are using .NET 1.1, the volatile keyword is needed when doing double-checked locking. Why? Because prior to .NET 2.0, the following scenario could cause a second thread to access a non-null, yet not fully constructed object:

  1. Thread 1 asks if a variable is null. //if(this.foo == null)
  2. Thread 1 determines the variable is null, so enters a lock. //lock(this.bar)
  3. Thread 1 asks AGAIN if the variable is null. //if(this.foo == null)
  4. Thread 1 still determines the variable is null, so it calls a constructor and assigns the value to the variable. //this.foo = new Foo();

Prior to .NET 2.0, this.foo could be assigned the new instance of Foo, before the constructor was finished running. In this case, a second thread could come in (during thread 1's call to Foo's constructor) and experience the following:

  1. Thread 2 asks if variable is null. //if(this.foo == null)
  2. Thread 2 determines the variable is NOT null, so tries to use it. //this.foo.MakeFoo()

Prior to .NET 2.0, you could declare this.foo as being volatile to get around this problem. Since .NET 2.0, you no longer need to use the volatile keyword to accomplish double checked locking.
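Put together, the pattern being described looks roughly like this (Foo, this.foo and this.bar are the illustrative names from the steps above; the volatile modifier is what the .NET 1.x memory model required to make the unlocked first check safe):

```csharp
using System;

public class Foo
{
    public void MakeFoo() { /* ... */ }
}

public class Container
{
    // On .NET 1.x, 'volatile' here prevented the unlocked first check from
    // observing a reference to a not-yet-fully-constructed Foo.
    private volatile Foo foo;
    private readonly object bar = new object();

    public Foo GetFoo()
    {
        if (this.foo == null)             // 1. first, unlocked check
        {
            lock (this.bar)               // 2. enter the lock
            {
                if (this.foo == null)     // 3. second, locked check
                {
                    this.foo = new Foo(); // 4. construct and publish
                }
            }
        }
        return this.foo;
    }
}
```

Every caller after the first takes the cheap unlocked path; the volatile read is what guarantees they see a fully constructed object rather than a prematurely published reference.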

Wikipedia actually has a good article on Double Checked Locking, and briefly touches on this topic: http://en.wikipedia.org/wiki/Double-checked_locking

Ormazd answered 27/2, 2014 at 20:25 Comment(5)
this is exactly what I see in a legacy code and was wondering about it. that is why I started a deeper research. Thanks!Ribal
I don't understand how Thread 2 would assign a value to foo. Isn't thread 1 locking this.bar, and therefore only thread 1 will be able to initialize foo at a given point in time? I mean, you do check the value after the lock is released again, when anyway it should have the new value from thread 1.Decern
@Decern My understanding is that it is not that Thread2 would assign a value to foo, its that Thread2 would use a not fully initialized foo, even though it is non-null.Cran
@Cran I am not sure why I phrased it that way - I think I assumed it was a singleton, so all threads would access the object in a similar manner, through the double-checked locking - in that case I am not sure how volatile would be necessary.Decern
@Decern I believe what they're saying is that prior to .NET 2.0, in the line this.foo = new Foo();, the compiler was allowed to perform the field assignment before the end of the constructor, as long as the assignment occurred before any instructions from the following line. Such a window between field assignment and constructor completion wouldn't be a problem for a single thread, but if a second thread encountered the double-checked lock during that window, then it could attempt to use the field before the first thread had finished constructing it.Pigmy

Sometimes, the compiler will optimize a field and use a register to store it. If thread 1 does a write to the field and another thread accesses it, since the update was stored in a register (and not memory), the 2nd thread would get stale data.

You can think of the volatile keyword as saying to the compiler "I want you to store this value in memory". This guarantees that the 2nd thread retrieves the latest value.
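A minimal sketch of that scenario (hypothetical names; without the volatile modifier the JIT would be free to keep the first read of _stop in a register, and the loop might never observe the writer's update):

```csharp
using System;
using System.Threading;

class StopDemo
{
    // Without 'volatile', the JIT could hoist the read of _stop into a
    // register and never re-read it, so the loop below might spin forever.
    private static volatile bool _stop;

    public static void Main()
    {
        var reader = new Thread(() =>
        {
            while (!_stop) { } // re-reads memory on every iteration
            Console.WriteLine("Writer's update observed.");
        });
        reader.Start();
        Thread.Sleep(100); // let the reader start spinning
        _stop = true;      // this write becomes visible to the spinning thread
        reader.Join();
    }
}
```

In a release build the register-caching optimization is exactly the kind the answer describes; volatile forbids it for this field.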

Kickstand answered 16/9, 2008 at 13:52 Comment(0)

From MSDN: The volatile modifier is usually used for a field that is accessed by multiple threads without using the lock statement to serialize access. Using the volatile modifier ensures that one thread retrieves the most up-to-date value written by another thread.

Hetaera answered 16/9, 2008 at 13:45 Comment(2)
This does not seem to be true. From the docs: "On a multiprocessor system, a volatile read operation does not guarantee to obtain the latest value written to that memory location by any processor."Chelate
@Chelate last year I asked a question on GitHub about this exact statement in the docs, that you might find interesting.Chloroprene

The CLR likes to optimize instructions, so when you access a field in code it might not always access the current value of the field (it might be from the stack, etc). Marking a field as volatile ensures that the current value of the field is accessed by the instruction. This is useful when the value can be modified (in a non-locking scenario) by a concurrent thread in your program or some other code running in the operating system.

You obviously lose some optimization, but it does keep the code simpler.

Pereyra answered 16/9, 2008 at 13:42 Comment(0)

I found this article by Joydip Kanjilal very helpful:

When you mark an object or a variable as volatile, it becomes a candidate for volatile reads and writes. It should be noted that in C# all memory writes are volatile irrespective of whether you are writing data to a volatile or a non-volatile object. However, the ambiguity happens when you are reading data. When you are reading data that is non-volatile, the executing thread may or may not always get the latest value. If the object is volatile, the thread always gets the most up-to-date value.

Blackcap answered 17/11, 2019 at 21:39 Comment(0)

Simply looking at the official page for the volatile keyword, you can see an example of typical usage.

public class Worker
{
    public void DoWork()
    {
        bool work = false;
        while (!_shouldStop)
        {
            work = !work; // simulate some work
        }
        Console.WriteLine("Worker thread: terminating gracefully.");
    }
    public void RequestStop()
    {
        _shouldStop = true;
    }
    
    private volatile bool _shouldStop;
}

With the volatile modifier added to the declaration of _shouldStop in place, you'll always get the same results. However, without that modifier on the _shouldStop member, the behavior is unpredictable.

So this is definitely not something downright crazy.

There exists cache coherence, which is responsible for keeping CPU caches consistent.

Also, if the CPU employs a strong memory model (as x86 does):

As a result, reads and writes of volatile fields require no special instructions on the x86: Ordinary reads and writes (for example, using the MOV instruction) are sufficient.

Example from C# 5.0 specification (chapter 10.5.3)

using System;
using System.Threading;
class Test
{
    public static int result;   
    public static volatile bool finished;
    static void Thread2() {
        result = 143;    
        finished = true; 
    }
    static void Main() {

        finished = false;
        new Thread(new ThreadStart(Thread2)).Start();

        for (;;) {
            if (finished) {
                Console.WriteLine("result = {0}", result);
                return;
            }
        }
    }
}

produces the output: result = 143

If the field finished had not been declared volatile, then it would be permissible for the store to result to be visible to the main thread after the store to finished, and hence for the main thread to read the value 0 from the field result.

Volatile behavior is platform dependent, so you should always evaluate, case by case, whether volatile satisfies your needs.

Even volatile cannot prevent all kinds of reordering (C# - The C# Memory Model in Theory and Practice, Part 2)

Even though the write to A is volatile and the read from A_Won is also volatile, the fences are both one-directional, and in fact allow this reordering.

So I believe that if you want to know when to use volatile (vs lock vs Interlocked), you should get familiar with memory fences (full, half) and the needs of synchronization. Then you can work out the answer for yourself.

Morbific answered 4/12, 2020 at 13:49 Comment(0)

The compiler sometimes changes the order of statements in code to optimize it. Normally this is not a problem in a single-threaded environment, but it might be an issue in a multi-threaded environment. See the following example:

 private static int _flag = 0;
 private static int _value = 0;

 var t1 = Task.Run(() =>
 {
     _value = 10; /* compiler could switch these lines */
     _flag = 5;
 });

 var t2 = Task.Run(() =>
 {
     if (_flag == 5)
     {
         Console.WriteLine("Value: {0}", _value);
     }
 });

If you run t1 and t2, you would expect no output or "Value: 10" as the result. It could be that the compiler switches the lines inside the t1 function. If t2 then executes, it could be that _flag has the value 5 while _value has the value 0, so the expected logic could be broken.

To fix this you can apply the volatile keyword to the field. This prevents the compiler from reordering the accesses, so you can force the correct order in your code.

private static volatile int _flag = 0;

You should use volatile only if you really need it, because it disables certain compiler optimizations and will hurt performance. It's also not supported by all .NET languages (Visual Basic doesn't support it), so it hinders language interoperability.

Entanglement answered 8/2, 2016 at 16:44 Comment(2)
Your example is really bad. The programmer should never have any expectation on the value of _flag in the t2 task based on the fact that t1's code is written first. Written first != executed first. It doesn't matter if the compiler DOES switch those two lines in t1. Even if the compiler didn't switch those statements, your Console.WriteLine in the else branch may still execute, even WITH the volatile keyword on _flag.Methuselah
@jakotheshadows, you are right, I have edited my answer. My main idea was to show that the expected logic could be broken when we run t1 and t2 simultaneouslyEntanglement

So to sum up all this, the correct answer to the question is: if your code is running on the 2.0 runtime or later, the volatile keyword is almost never needed and does more harm than good if used unnecessarily - i.e., don't ever use it. But in earlier versions of the runtime, it IS needed for proper double-checked locking on static fields - specifically, static fields whose class has static class initialization code.

Cormac answered 14/12, 2018 at 12:32 Comment(0)

Here is an example where the volatile keyword is used effectively for its intended purpose. Let's say that we want to implement a rudimentary version of the Task<TResult> class. Our class contains only two fields, _completed and _result, and supports only two operations: setting the _result and reading it only if it's _completed. Let's see a lock-free implementation of this simple type:

public class MyTask<TResult>
{
    private volatile bool _completed;
    private TResult _result;

    public void UnsafeSetResult(TResult result)
    {
        _result = result;
        _completed = true;
    }

    public bool TryGetResult(out TResult result)
    {
        if (_completed)
        {
            result = _result;
            return true;
        }
        result = default;
        return false;
    }
}

The UnsafeSetResult method is named "unsafe", because it should be called only once during the whole lifetime of a MyTask<TResult> instance. We will deal with this limitation at the end of this answer, but for now let's assume that this rule can be enforced simply by the structure of our application. For example we may have a single dedicated thread that is responsible for creating the MyTask<TResult> objects, and calling the UnsafeSetResult on them.

The important question that we have to answer is: why does TryGetResult work correctly in a multithreaded environment? What prevents a thread that calls TryGetResult from receiving a torn TResult value? The answer lies in the _completed field being declared as volatile, and in the order in which the _completed and _result fields are assigned and read. We want to ensure that no thread will attempt to read the _result before its value has been completely stored in the field. Notice that we impose no limitation on the TResult generic parameter, so it is entirely possible for it to be a large struct, like a decimal, an Int128, a ValueTuple<long,long,long,long> etc. If we are not careful, a thread might read a half-written _result value, with half of its bytes still uninitialized. This is called "tearing", and it's the catastrophe that we want to prevent.

We can ensure that tearing will not occur by setting the _completed to true after we assign the _result, and reading the _result after we have confirmed that the _completed is true. The volatile keyword on the _completed field ensures that neither the C# compiler nor the .NET jitter will emit CPU instructions that access/modify the computer's memory in a different order. In case you didn't know, the C# compiler and the .NET jitter, as well as the CPU itself, are allowed to reorder the instructions of a program, provided that this reordering does not affect the program's behavior when running on a single thread.

Let's see precisely what effect the volatile has on the UnsafeSetResult method:

It inserts a memory barrier that prevents the processor from reordering memory operations as follows: If a read or write appears before this method in the code, the processor cannot move it after this method.

In other words the _result = result; cannot be moved after the _completed = true;.

Now let's see precisely what effect the volatile has on the TryGetResult method:

It inserts a memory barrier that prevents the processor from reordering memory operations as follows: If a read or write appears after this method in the code, the processor cannot move it before this method.

In other words the result = _result; cannot be moved before the if (_completed).

As you can see we need memory barriers in both methods. If we remove any one of the two memory barriers, the correctness of our program is no longer guaranteed.

Finally let's see how we could implement a thread-safe version of the UnsafeSetResult. We'll need a transitional "reserved" state, beyond the false/true values of a bool field. So we'll use a volatile int _state field instead:

public class MyTask<TResult>
{
    private volatile int _state; // 0:incomplete, 1:reserved, 2:completed
    private TResult _result;

    public bool TrySetResult(TResult result)
    {
        if (Interlocked.CompareExchange(ref _state, 1, 0) == 0)
        {
            _result = result;
            _state = 2;
            return true;
        }
        return false;
    }

    public bool TryGetResult(out TResult result)
    {
        if (_state == 2)
        {
            result = _result;
            return true;
        }
        result = default;
        return false;
    }
}

The actual Task<TResult> class has a similar internal volatile int m_stateFlags; field (source code), that has one of its bits flipped atomically (CompletionReserved, source code) before assigning the internal TResult? m_result; field.

Chloroprene answered 4/11, 2023 at 18:26 Comment(8)
In other words, volatile gives you acquire / release semantics for loads / stores respectively. preshing.com/20120913/acquire-and-release-semantics.Cozmo
Also note that out-of-order execution can happen without memory reordering. (e.g. x86, where the memory model is program order + a store buffer with store forwarding, so program-order commit of stores from the store buffer, even though stores could exec (write their data to the store buffer) in any order.) And memory reordering can happen on in-order CPUs, StoreLoad via the store buffer like on x86, but also LoadLoad via hit-under-miss caches, etc. How is load->store reordering possible with in-order commit?Cozmo
A lot of discussion about C# memory ordering does sloppily use "execute" to describe when stores commit data to L1d cache (making it globally visible), but that's not accurate in computer architecture terms.Cozmo
@PeterCordes thanks for the comments. I wrote this answer because I think it's interesting how the volatile can be applied on a field A, with the intention to protect a non-volatile field B. I used almost exclusively the terminology and official documentation available to C# developers, in order to not alienate the audience. :-)Chloroprene
@PeterCordes should I change the Wiki link in the answer from Out-of-order execution to Memory ordering? To be honest I am not aware of the nuances. I just found a link that looked relevant. :-)Chloroprene
Yeah, the en.wikipedia.org/wiki/Memory_ordering wiki link is probably better. Or perhaps preshing.com/20120710/… which is a useful mental model of memory ordering (local reordering of accesses to coherent shared state) which I think matches the C# memory model well.Cozmo
@PeterCordes OK, I replaced it with the "Memory ordering" Wikipedia link. Jeff Preshing's blog post could be more informative, but also more C++ centric, and might scare the audience here. :-)Chloroprene
Totally fair. I'd still highly recommend it to anyone reading these comments.Cozmo

From the documentation:

On a multiprocessor system, a volatile read operation does not guarantee to obtain the latest value written to that memory location by any processor.

This to me sounds like you are NOT guaranteed to "get the most up-to-date value" as some of the other answers say. From the specification, the only guarantees seem to be:

volatile int v;

x = 5;
print(y);
...
v = 3; // previous read/write operations on this thread cannot be moved after this (even if there are no dependencies)

...

print(v); // following read/write operations on this thread cannot be moved before this (even if there are no dependencies)
x = 5;
print(y);
...

Make of this what you will. This for example means that if thread 1 does multiple volatile writes:

// thread 1
v1 = 1
...
v2 = 2

Then thread 2 will be guaranteed to see v1 being set to 1 and THEN v2 being set to 2. But for example inter-thread ordering is not guaranteed. Example from Albahari:

// thread 1
volatile int v1 = 0;
v1 = 1;
print(v2)

// thread 2
volatile int v2 = 0;
v2 = 1;
print(v1);

You may see:

0
0

printed since reads can be moved before the writes.

The specification also does not say anything about caching. But the docs do mention it in a note:

Volatile reads and writes ensure that a value is read or written to memory and not cached (for example, in a processor register).

Which, if true, means you could use them for synchronization between threads like this:

// main thread
v = false // at t = 0s

// worker thread 1
...
// value is written directly to memory
v = true // at t = 1s
...

// worker thread 2
...
// latest value from memory is retrieved
if (v)
{
  print("this is guaranteed to be printed after t = 1s")
}
...

This seems to contradict the earlier note that you cannot assume you are getting the latest value.

IMO it seems best to avoid volatile in everyday programming and only use it carefully for the case where its specific weak requirements provide some performance optimization over stronger synchronization mechanism like locks.

Chelate answered 21/8, 2023 at 8:50 Comment(4)
I don't think that your answer addresses what is asked in the question. The question asks "which problems does volatile solve and which it doesn't?". Your answer basically is "I don't know of any problem that is solved by volatile, so I discourage you to use it, unless you need performance and you are careful." That's not helpful. We have already Eric Lippert, through Ohad Schneider's answer, to discourage people from using volatile. We don't need more panic and negativity. People should make decisions based on knowledge and solid information, not in a state of confusion.Chloroprene
@TheodorZoulias The answer is to not use it because the documentation has conflicting statements unless you understand the optimizations. It does not make sense to give concrete use cases when there is a chance those will be broken once the documentation / implementation fixes those contradictions one way or the other. Pretty much every other answer in this thread says you will get the latest value when reading a volatile. My answer adds to the discussion by pointing out the documentation says otherwise.Chelate
Would you also advise Microsoft to stop using the volatile, like here (the _state of the CancellationTokenSource), until they fix the inconsistencies in the docs? Otherwise, if Microsoft is allowed to use it, why not everyone else?Chloroprene
"Pretty much every other answer in this thread says you will get the latest value when reading a volatile." -- That's because at some point in history this was written in the docs. Computer architectures have become more complicated over the years, making the concept of "latest value" very fuzzy, so the phrase was removed from the docs. If you want to know the latest value of a shared field on a multiprocessor system, using the lock won't give you a more up-to-date value that using the volatile. The natural latency of the system will be imposed on you no matter what.Chloroprene
