What does the volatile keyword do? In C++, what problem does it solve?
In my case, I have never knowingly needed it.
volatile is needed if you are reading from a spot in memory that, say, a completely separate process, device, or other agent may write to.
I used to work with dual-port RAM in a multiprocessor system in straight C. We used a hardware-managed 16-bit value as a semaphore to know when the other guy was done. Essentially we did this:
void waitForSemaphore()
{
    /* well-known address of my semaphore */
    volatile uint16_t* semPtr = WELL_KNOWN_SEM_ADDR;
    /* spin until the other side releases it; volatile forces a fresh read on every pass */
    while ((*semPtr) != IS_OK_FOR_ME_TO_PROCEED)
        ;
}
Without volatile, the optimizer sees the loop as useless (The guy never sets the value! He's nuts, get rid of that code!) and my code would proceed without having acquired the semaphore, causing problems later on.
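The polling pattern above can be sketched in C++ roughly as follows. This is a minimal sketch under stated assumptions: `wait_for_value` is a hypothetical helper, an ordinary variable stands in for the hardware semaphore, and the spin limit exists only so the sketch terminates when nothing ever writes the location.

```cpp
#include <cstdint>

// Hypothetical helper: polls *reg until it equals `expected` or until
// `max_spins` reads have been made. Because the pointee is volatile, the
// compiler must perform a real load on every iteration instead of reusing
// the first value it read.
bool wait_for_value(const volatile std::uint16_t* reg,
                    std::uint16_t expected,
                    unsigned long max_spins)
{
    for (unsigned long i = 0; i < max_spins; ++i) {
        if (*reg == expected) {
            return true;   // the other side released the semaphore
        }
    }
    return false;          // gave up: the value never appeared
}
```

On real hardware the pointer would come from a platform-specific address, and the loop would usually have no limit or a timeout based on a timer.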
(...) uint16_t* volatile semPtr was written instead? This should mark the pointer as volatile (instead of the value pointed to), so that checks to the pointer itself, e.g. semPtr == SOME_ADDR may not be optimized. This however implies a volatile pointed value again as well. No? – Crompton
(...) *semPtr ? – Covenantor
(...) std::atomic, your only option was to hack things up yourself with inline asm or library function calls for any necessary barriers. You could wrap the volatile access in a read_once() function or macro like the Linux kernel does, but it would still boil down to this to get the asm you want on any sane implementation where an aligned volatile uint16_t can be read/written atomically. (i.e. on most specific platforms, the actual behaviour you get from this is well-defined.) – Harumscarum
(...) atomic<uint16_t> or volatile atomic<uint16_t> to do acquire-loads. In this case the compiler couldn't hoist the load because that would make the loop infinite, so volatile isn't needed here. Can and does the compiler optimize out two atomic loads?) – Harumscarum
(...) volatile is needed under some conditions. But this is not so. For example, consider a platform that provides a native type, say atomic_int, that is documented to be suitable for reading from memory that completely separate devices might write to. Certainly volatile would not be needed on that platform. Because this is very often the case, in practice, volatile is only very rarely needed, even when you need this behavior. – Clinandrium
(...) atomic_int do? In my world, we mainly use "atomic" to describe a series of operations that need to be performed sequentially w/o being interrupted by other operations. [cont...] – Northing
(...) atomic_int that is documented to behave exactly the same as volatile int does. In that case, you would not need to use volatile on that platform since you could use atomic_int. But this answer says volatile is needed. That is wrong. Most platforms offer things like atomic_int that have guaranteed semantics without needing to use volatile. – Clinandrium
(...) volatile int, as if there was a typedef volatile int atomic_int, and then say the use of volatile is not necessary? If so, then the same argument could be used to say that if the system provides a type called whole that behaves like int then using int is not necessary???! Also, I think that in my world, this won't be an appropriate use of the word atomic, as described above. Or did I completely miss your point? – Northing
(...) volatile is necessary because other things can provide the guarantees and on every realistic platform, there are in fact other things that provide those guarantees and volatile is not actually used. It's very misleading to say something is "necessary" (in fact, outright wrong) when it is not even the most common solution. – Clinandrium
volatile is needed when developing embedded systems or device drivers, where you need to read or write a memory-mapped hardware device. The contents of a particular device register could change at any time, so you need the volatile keyword to ensure that such accesses aren't optimised away by the compiler.
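A minimal sketch of memory-mapped register access, under loud assumptions: the `UartRegs` layout, the `TX_READY` bit, and `uart_send` are all hypothetical, and an ordinary struct instance stands in for the device. The static assertions at the end also illustrate the difference between a pointer to volatile data and a volatile pointer, which came up in the comments: qualifying the pointer itself does not make the pointee volatile.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical register layout for an imaginary UART; on real hardware the
// struct would live at an address taken from the datasheet.
struct UartRegs {
    volatile std::uint32_t status;  // the device may change this at any time
    volatile std::uint32_t data;    // every write is an observable side effect
};

constexpr std::uint32_t TX_READY = 1u << 0;  // assumed bit position

// Writes one byte once the (simulated) device reports it is ready.
// Returns false if the ready bit never shows up within max_spins reads.
bool uart_send(UartRegs& uart, std::uint8_t byte, unsigned max_spins)
{
    for (unsigned i = 0; i < max_spins; ++i) {
        if (uart.status & TX_READY) {   // fresh load on each iteration
            uart.data = byte;           // the store cannot be elided
            return true;
        }
    }
    return false;
}

// `volatile T*` qualifies the pointee; `T* volatile` qualifies the pointer.
using PtrToVolatile = volatile std::uint16_t*;
using VolatilePtr   = std::uint16_t* volatile;
static_assert(std::is_volatile<std::remove_pointer<PtrToVolatile>::type>::value,
              "pointee is volatile");
static_assert(!std::is_volatile<PtrToVolatile>::value,
              "the pointer itself is not volatile");
static_assert(std::is_volatile<VolatilePtr>::value,
              "the pointer itself is volatile");
static_assert(!std::is_volatile<std::remove_pointer<VolatilePtr>::type>::value,
              "pointee is not volatile");
```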
Some processors have floating point registers that have more than 64 bits of precision (eg. 32-bit x86 without SSE, see Peter's comment). That way, if you run several operations on double-precision numbers, you actually get a higher-precision answer than if you were to truncate each intermediate result to 64 bits.
This is usually great, but it means that depending on how the compiler assigned registers and did optimizations you'll have different results for the exact same operations on the exact same inputs. If you need consistency then you can force each operation to go back to memory by using the volatile keyword.
It's also useful for some algorithms that make no algebraic sense but reduce floating point error, such as Kahan summation. Algebraically it's a no-op, so it will often get incorrectly optimized out unless some intermediate variables are volatile.
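A minimal sketch of Kahan summation with volatile intermediates, as described above. The function name `kahan_sum` is an assumption made for illustration. On targets that already evaluate in strict double precision without value-changing optimizations the qualifiers should not alter the result; they matter where the compiler would otherwise simplify the compensation term away or keep intermediates in wider registers.

```cpp
#include <vector>

// Compensated (Kahan) summation. Each intermediate is volatile so that it is
// stored and reloaded as a 64-bit double, and so that the compiler cannot
// cancel (t - sum) - y algebraically.
double kahan_sum(const std::vector<double>& values)
{
    volatile double sum = 0.0;
    volatile double c = 0.0;            // running compensation for lost low bits
    for (double x : values) {
        volatile double y = x - c;      // apply the correction from the last step
        volatile double t = sum + y;    // low-order bits of y may be lost here
        c = (t - sum) - y;              // recover what was lost (negated)
        sum = t;
    }
    return sum;
}
```

For example, adding ten copies of 1e-16 to 1.0 one at a time with a plain `double` accumulator leaves the total at 1.0 under IEEE round-to-nearest, while the compensated version ends up above 1.0.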
(...) volatile double instead of just double, so to ensure that it is truncated from FPU precision to 64-bit (RAM) precision before continuing further computations. The results were substantially different because of a further exaggeration of the floating-point error. – Fenestra
(...) g++ -mfpmath=sse to use it for 32-bit x86 as well. You can use gcc -ffloat-store to force rounding everywhere even when using x87, or you can set the x87 precision to 53-bit mantissa: randomascii.wordpress.com/2012/03/21/…. – Harumscarum
(...) volatile to force rounding in a few specific places without losing the benefits everywhere. – Harumscarum
(...) volatile. If it's a microcoded instruction like x87 fsin then you could in theory have extra precision kept between uops. But no it wouldn't be resumable, if it's interruptible it would abort. – Harumscarum
(...) f0 internally keeping more than 64-bit precision across a chain of FP instructions, so volatile to force store/reload would have an effect. Most FPUs (other than 387) only provide the IEEE Basic operations + - * / sqrt, which are required to produce correctly-rounded results (error <= 0.5ulp). So unless you keep extra internal precision between instructions, the results are fully specified. Fun fact: AMD *does* keep extra internal data between FP instructions, but not extra precision; probably something like exponent/mantissa unpacking. – Harumscarum
(...) double to intermediate C++ values. Volatile is very reliable for that purpose: double x,y,z,a; volatile double r; r=y*z; a=x+r; (Ppl say that a cast has the same effect: x+(double)(y*z) but that relies on the compiler front end for the conversion to effective double precision of an expression of static type double, which was unreliable on at least one popular compiler.) – Tallula
(...) y*z. Just like FP_CONTRACT, gcc optimizes across statements by default, not just within expressions with rounding to actual double forced by assignments and casts, even though FLT_EVAL_METHOD = 2 says it should. That would be slow. But again, only a problem with >64-bit registers. – Harumscarum
From a "Volatile as a promise" article by Dan Saks:
(...) a volatile object is one whose value might change spontaneously. That is, when you declare an object to be volatile, you're telling the compiler that the object might change state even though no statements in the program appear to change it."
Here are links to three of his articles regarding the volatile keyword:
You MUST use volatile when implementing lock-free data structures. Otherwise the compiler is free to optimize access to the variable, which will change the semantics.
To put it another way, volatile tells the compiler that accesses to this variable must correspond to a physical memory read/write operation.
For example, this is how InterlockedIncrement is declared in the Win32 API:
LONG __cdecl InterlockedIncrement(
    __inout LONG volatile *Addend
);
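For comparison, since C++11 a lock-free increment can also be written with std::atomic rather than a volatile-qualified pointer. The following is a minimal sketch, not the Win32 implementation: `atomic_increment` is a hypothetical name, and it mirrors InterlockedIncrement only in that it returns the incremented value.

```cpp
#include <atomic>

// Lock-free increment built from a compare-exchange loop.
// Returns the value after the increment.
long atomic_increment(std::atomic<long>& addend)
{
    long observed = addend.load(std::memory_order_relaxed);
    // On failure compare_exchange_weak stores the current value into
    // `observed`, so the loop retries with fresh data until it succeeds.
    while (!addend.compare_exchange_weak(observed, observed + 1,
                                         std::memory_order_acq_rel,
                                         std::memory_order_relaxed)) {
    }
    return observed + 1;
}
```

In real code `addend.fetch_add(1) + 1` does the same job more directly; the explicit loop is shown only because it is the usual building block of lock-free data structures.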
(...) std::atomic<LONG> so you can write lockless code more safely without problems of having pure loads / pure stores optimized away, or reordered, or whatever else. – Harumscarum
A large application that I used to work on in the early 1990s contained C-based exception handling using setjmp and longjmp. The volatile keyword was necessary on variables whose values needed to be preserved in the block of code that served as the "catch" clause, lest those vars be stored in registers and wiped out by the longjmp.
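A minimal sketch of the setjmp/longjmp situation described above. All names (`recovery_point`, `fail_hard`, `attempts_seen_after_longjmp`) are hypothetical. The local variable is modified between the setjmp call and the longjmp, which is the case where it has to be volatile for its value to be determinate afterwards.

```cpp
#include <csetjmp>

static std::jmp_buf recovery_point;  // hypothetical name

// Stands in for code deep in the call stack that "throws".
static void fail_hard()
{
    std::longjmp(recovery_point, 1);
}

int attempts_seen_after_longjmp()
{
    // Modified after setjmp and read after longjmp, so it must be volatile;
    // otherwise it could live in a register whose contents are rolled back.
    volatile int attempts = 0;

    if (setjmp(recovery_point) == 0) {
        attempts = attempts + 1;   // "try" block
        fail_hard();               // never returns normally
    }
    return attempts;               // "catch" path sees the updated value
}
```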
In Standard C, one of the places to use volatile is with a signal handler. In fact, in Standard C, all you can safely do in a signal handler is modify a volatile sig_atomic_t variable, or exit quickly. Indeed, AFAIK, it is the only place in Standard C that the use of volatile is required to avoid undefined behaviour.
ISO/IEC 9899:2011 §7.14.1.1 The signal function
¶5 If the signal occurs other than as the result of calling the abort or raise function, the behavior is undefined if the signal handler refers to any object with static or thread storage duration that is not a lock-free atomic object other than by assigning a value to an object declared as volatile sig_atomic_t, or the signal handler calls any function in the standard library other than the abort function, the _Exit function, the quick_exit function, or the signal function with the first argument equal to the signal number corresponding to the signal that caused the invocation of the handler. Furthermore, if such a call to the signal function results in a SIG_ERR return, the value of errno is indeterminate.252)
252) If any signal is generated by an asynchronous signal handler, the behavior is undefined.
That means that in Standard C, you can write:
static volatile sig_atomic_t sig_num = 0;
static void sig_handler(int signum)
{
    signal(signum, sig_handler);
    sig_num = signum;
}
and not much else.
POSIX is a lot more lenient about what you can do in a signal handler, but there are still limitations (and one of the limitations is that the Standard I/O library, printf() et al, cannot be used safely).
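A minimal C++ sketch of how such a flag might be used from the rest of the program. The names `last_signal`, `record_signal`, and `raise_and_observe` are hypothetical, and SIGTERM is raised synchronously only so the sketch is self-contained; a real program would check the flag in its main loop.

```cpp
#include <csignal>

// The only object the handler touches: a volatile sig_atomic_t flag.
static volatile std::sig_atomic_t last_signal = 0;

extern "C" void record_signal(int signum)
{
    last_signal = signum;   // the handler does nothing else
}

// Installs the handler, raises SIGTERM, and reports what the handler stored.
// Returns -1 if the handler could not be installed.
int raise_and_observe()
{
    last_signal = 0;
    if (std::signal(SIGTERM, record_signal) == SIG_ERR) {
        return -1;
    }
    std::raise(SIGTERM);
    std::signal(SIGTERM, SIG_DFL);  // restore the default disposition
    return last_signal;
}
```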
Developing for an embedded system, I have a loop that checks a variable that can be changed in an interrupt handler. Without "volatile", the loop becomes a no-op: as far as the compiler can tell, the variable never changes, so it optimizes the check away.
The same thing would apply to a variable that may be changed by a different thread in a more traditional environment, but there we often make synchronization calls, so the compiler is not so free with optimization.
I've used it in debug builds when the compiler insists on optimizing away a variable that I want to be able to see as I step through code.
Besides using it as intended, volatile is used in (template) metaprogramming. It can be used to prevent accidental overloading, as the volatile attribute (like const) takes part in overload resolution.
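#include <iostream>
#include <type_traits>

template <typename T>
class Foo {
    // Preferred overload; removed by SFINAE unless sizeof(T) == 4.
    template <typename U = T>
    std::enable_if_t<sizeof(U) == 4, void> f(T& t)
    { std::cout << 1 << t; }
    // Fallback: binding to volatile& ranks worse, so it only wins when the overload above is gone.
    void f(T volatile& t)
    { std::cout << 2 << const_cast<T&>(t); }
public:
    void bar() { T t{}; f(t); }
};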
This is legal; both overloads are potentially callable and do almost the same thing. The cast in the volatile overload is legal because we know bar won't pass a volatile T anyway. The volatile version is strictly worse, though, so it is never chosen in overload resolution if the non-volatile f is available.
Note that the code never actually depends on volatile memory access.
The volatile keyword is intended to prevent the compiler from applying any optimisations on objects that can change in ways that cannot be determined by the compiler.
Objects declared as volatile are omitted from optimisation because their values can be changed by code outside the scope of the current code at any time. The system always reads the current value of a volatile object from its memory location rather than keeping its value in a temporary register at the point it is requested, even if a previous instruction asked for a value from the same object.
Consider the following cases
1) Global variables modified by an interrupt service routine outside the scope.
2) Global variables within a multi-threaded application.
If we do not use volatile qualifier, the following problems may arise
1) Code may not work as expected when optimisation is turned on.
2) Code may not work as expected when interrupts are enabled and used.
Volatile: A programmer’s best friend
https://en.wikipedia.org/wiki/Volatile_(computer_programming)
Other answers already mention avoiding some optimization in order to:
Volatile is essential whenever you need a value to appear to come from outside the program and be unpredictable, so that the compiler cannot optimize based on the value being known. It is also useful when a result isn't actually used but you need it to be computed, or when it is used but you want to compute it several times for a benchmark and need the computations to start and end at precise points.
A volatile read is like an input operation (like scanf or a use of cin): the value seems to come from the outside of the program, so any computation that has a dependency on the value needs to start after it.
A volatile write is like an output operation (like printf or a use of cout): the value seems to be communicated outside of the program, so if the value depends on a computation, it needs to be finished before.
So a pair of volatile read/write can be used to tame benchmarks and make time measurement meaningful.
Without volatile, your computation could be started by the compiler before, as nothing would prevent reordering of computations with functions such as time measurement.
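The volatile read/write pairing described above might look like the following minimal sketch. The names `bench_input`, `bench_sink`, `sum_below`, and `timed_run_seconds` are hypothetical. The sketch relies on the compiler treating the clock calls as opaque; volatile only guarantees that the input is not treated as a known constant and that the result is actually produced.

```cpp
#include <chrono>
#include <cstdint>

volatile std::uint32_t bench_input = 1000;  // read like an input operation
volatile std::uint64_t bench_sink  = 0;     // written like an output operation

// Work under test: sums the integers 0 .. n-1.
std::uint64_t sum_below(std::uint32_t n)
{
    std::uint64_t total = 0;
    for (std::uint32_t i = 0; i < n; ++i) {
        total += i;
    }
    return total;
}

// Times one run; returns the elapsed time in seconds.
double timed_run_seconds()
{
    const auto start = std::chrono::steady_clock::now();
    const std::uint32_t n = bench_input;   // volatile read: n is not a compile-time fact
    bench_sink = sum_below(n);             // volatile write: the result must be computed
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Without the volatile input, an optimizer could fold `sum_below(1000)` into a constant and the measurement would time nothing.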
All the answers are excellent. On top of that, I would like to share an example.
Below is a little C++ program:
#include <iostream>

int x;

int main(){
    char buf[50];
    x = 8;

    if(x == 8)
        printf("x is 8\n");
    else
        sprintf(buf, "x is not 8\n");

    x=1000;
    while(x > 5)
        x--;
    return 0;
}
Now, let's generate the assembly of the above code (I will paste only those portions of the assembly which are relevant here):
The command to generate assembly:
g++ -S -O3 -c -fverbose-asm -Wa,-adhln assembly.cpp
And the assembly:
main:
.LFB1594:
subq $40, %rsp #,
.seh_stackalloc 40
.seh_endprologue
# assembly.cpp:5: int main(){
call __main #
# assembly.cpp:10: printf("x is 8\n");
leaq .LC0(%rip), %rcx #,
# assembly.cpp:7: x = 8;
movl $8, x(%rip) #, x
# assembly.cpp:10: printf("x is 8\n");
call _ZL6printfPKcz.constprop.0 #
# assembly.cpp:18: }
xorl %eax, %eax #
movl $5, x(%rip) #, x
addq $40, %rsp #,
ret
.seh_endproc
.p2align 4,,15
.def _GLOBAL__sub_I_x; .scl 3; .type 32; .endef
.seh_proc _GLOBAL__sub_I_x
You can see that no assembly code was generated for sprintf, because the compiler assumed that x will not change outside of the program. The same goes for the while loop: it was removed altogether because the compiler saw it as useless code and directly assigned 5 to x (see movl $5, x(%rip)).
The problem arises if an external process or piece of hardware changes the value of x somewhere between x = 8; and if(x == 8). We would expect the else block to run, but the compiler has removed that part.
Now, in order to solve this, let us change int x; to volatile int x; in assembly.cpp and quickly look at the generated assembly code:
main:
.LFB1594:
subq $104, %rsp #,
.seh_stackalloc 104
.seh_endprologue
# assembly.cpp:5: int main(){
call __main #
# assembly.cpp:7: x = 8;
movl $8, x(%rip) #, x
# assembly.cpp:9: if(x == 8)
movl x(%rip), %eax # x, x.1_1
# assembly.cpp:9: if(x == 8)
cmpl $8, %eax #, x.1_1
je .L11 #,
# assembly.cpp:12: sprintf(buf, "x is not 8\n");
leaq 32(%rsp), %rcx #, tmp93
leaq .LC0(%rip), %rdx #,
call _ZL7sprintfPcPKcz.constprop.0 #
.L7:
# assembly.cpp:14: x=1000;
movl $1000, x(%rip) #, x
# assembly.cpp:15: while(x > 5)
movl x(%rip), %eax # x, x.3_15
cmpl $5, %eax #, x.3_15
jle .L8 #,
.p2align 4,,10
.L9:
# assembly.cpp:16: x--;
movl x(%rip), %eax # x, x.4_3
subl $1, %eax #, _4
movl %eax, x(%rip) # _4, x
# assembly.cpp:15: while(x > 5)
movl x(%rip), %eax # x, x.3_2
cmpl $5, %eax #, x.3_2
jg .L9 #,
.L8:
# assembly.cpp:18: }
xorl %eax, %eax #
addq $104, %rsp #,
ret
.L11:
# assembly.cpp:10: printf("x is 8\n");
leaq .LC1(%rip), %rcx #,
call _ZL6printfPKcz.constprop.1 #
jmp .L7 #
.seh_endproc
.p2align 4,,15
.def _GLOBAL__sub_I_x; .scl 3; .type 32; .endef
.seh_proc _GLOBAL__sub_I_x
Here you can see that assembly code for sprintf, printf and the while loop was generated. The advantage is that if the x variable is changed by some external program or hardware, the sprintf part of the code will be executed. Similarly, the while loop can now be used for busy waiting.
Besides telling the compiler not to optimize accesses to a variable (one that can be modified by a thread or an interrupt routine), the volatile keyword can also be used to work around some compiler bugs. Yes, it can.
For example, I worked on an embedded platform where the compiler was making some wrong assumptions about the value of a variable. If the code wasn't optimized, the program would run OK. With optimizations (which were really needed because it was a critical routine) the code wouldn't work correctly. The only solution (though not a very correct one) was to declare the 'faulty' variable as volatile.
Your program seems to work even without the volatile keyword? Perhaps this is the reason:
As mentioned previously, the volatile keyword helps in cases like
volatile int* p = ...; // point to some memory
while( *p!=0 ) {} // loop until the memory becomes zero
But there seems to be almost no effect once an external or non-inline function is being called. E.g.:
while( *p!=0 ) { g(); }
Then almost the same result is generated with or without volatile.
As long as g() can be completely inlined, the compiler can see everything that's going on and can therefore optimize. But when the program makes a call to a place where the compiler can't see what's going on, it isn't safe for the compiler to make any assumptions any more. Hence the compiler will generate code that always reads from memory directly.
But beware of the day when your function g() becomes inline (either due to explicit changes or due to compiler/linker cleverness): your code might then break if you forgot the volatile keyword!
Therefore I recommend adding the volatile keyword even if your program seems to work without it. It makes the intention clearer and the code more robust with respect to future changes.
(...) volatile qualified function pointer: void (* volatile fun_ptr)() = fun; fun_ptr(); – Tallula
In the early days of C, compilers would interpret all actions that read and write lvalues as memory operations, to be performed in the same sequence as the reads and writes appeared in the code. Efficiency could be greatly improved in many cases if compilers were given a certain amount of freedom to re-order and consolidate operations, but there was a problem with this. Even though operations were often specified in a certain order merely because it was necessary to specify them in some order, and thus the programmer picked one of many equally-good alternatives, that wasn't always the case. Sometimes it would be important that certain operations occur in a particular sequence.
Exactly which details of sequencing are important will vary depending upon the target platform and application field. Rather than provide particularly detailed control, the Standard opted for a simple model: if a sequence of accesses are done with lvalues that are not qualified volatile, a compiler may reorder and consolidate them as it sees fit. If an action is done with a volatile-qualified lvalue, a quality implementation should offer whatever additional ordering guarantees might be required by code targeting its intended platform and application field, without requiring that programmers use non-standard syntax.
Unfortunately, rather than identify what guarantees programmers would need, many compilers have opted instead to offer the bare minimum guarantees mandated by the Standard. This makes volatile much less useful than it should be. On gcc or clang, for example, a programmer needing to implement a basic "hand-off mutex" [one where a task that has acquired and released a mutex won't do so again until the other task has done so] must do one of four things:
1) Put the acquisition and release of the mutex in a function that the compiler cannot inline, and to which it cannot apply Whole Program Optimization.
2) Qualify all the objects guarded by the mutex as volatile, something which shouldn't be necessary if all accesses occur after acquiring the mutex and before releasing it.
3) Use optimization level 0 to force the compiler to generate code as though all objects that aren't qualified register are volatile.
4) Use gcc-specific directives.
By contrast, when using a higher-quality compiler which is more suitable for systems programming, such as icc, one would have another option:
(...) volatile-qualified write gets performed everyplace an acquire or release is needed.
Acquiring a basic "hand-off mutex" requires a volatile read (to see if it's ready), and shouldn't require a volatile write as well (the other side won't try to re-acquire it until it's handed back), but having to perform a meaningless volatile write is still better than any of the options available under gcc or clang.
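For comparison only: in standard C++11 and later, a hand-off of this kind can be sketched with std::atomic acquire/release operations instead of volatile. This swaps in a different technique from the one the answer describes, and the names `HandOff`, `try_acquire`, and `hand_over` are hypothetical.

```cpp
#include <atomic>

struct HandOff {
    std::atomic<int> owner{0};  // 0: side 0 may proceed, 1: side 1 may proceed
    int shared_data = 0;        // guarded object; deliberately not volatile
};

// Side `me` checks whether it currently holds the token. The acquire load
// makes everything the previous owner wrote before handing over visible.
bool try_acquire(HandOff& h, int me)
{
    return h.owner.load(std::memory_order_acquire) == me;
}

// Side `me` passes the token to the other side. The release store publishes
// all writes to the guarded data made while holding the token.
void hand_over(HandOff& h, int me)
{
    h.owner.store(1 - me, std::memory_order_release);
}
```

A side would call `try_acquire` in a loop, touch `shared_data` only after it returns true, and then call `hand_over`; it would not try again until the other side has handed the token back.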
I would like to quote Herb Sutter's words from his GotW #95, which can help to understand the meaning of the volatile variables:
C++ volatile variables (which have no analog in languages like C# and Java) are always beyond the scope of this and any other article about the memory model and synchronization. That’s because C++ volatile variables aren’t about threads or communication at all and don’t interact with those things. Rather, a C++ volatile variable should be viewed as portal into a different universe beyond the language - a memory location that by definition does not obey the language’s memory model because that memory location is accessed by hardware (e.g., written to by a daughter card), have more than one address, or is otherwise “strange” and beyond the language. So C++ volatile variables are universally an exception to every guideline about synchronization because are always inherently “racy” and unsynchronizable using the normal tools (mutexes, atomics, etc.) and more generally exist outside all normal of the language and compiler including that they generally cannot be optimized by the compiler (because the compiler isn’t allowed to know their semantics; a volatile int vi; may not behave anything like a normal int, and you can’t even assume that code like vi = 5; int read_back = vi; is guaranteed to result in read_back == 5, or that code like int i = vi; int j = vi; that reads vi twice will result in i == j which will not be true if vi is a hardware counter for example).
One use worth remembering: in a signal handler function, if you want to access or modify a global variable (for example, to set exit = true), you have to declare that variable as 'volatile'.
© 2022 - 2024 — McMap. All rights reserved.
(...) volatile can be used effectively, put together in pretty layman terms. Link : publications.gbdirect.co.uk/c_book/chapter8/… – Nunes