Volatile Pointer to Non Volatile Data

Suppose I have the following declaration:

int* volatile x;

I believe that this defines a volatile pointer to a "normal" (non-volatile) variable.

To me this could mean one of two things:

First Guess

The pointer can change, but the value it points to will not change without notice. This means that some other thread (one that the compiler doesn't know about) can change the pointer, but if the old pointer was pointing to a "12", then the new pointer (the new value of the pointer, after the thread changes it) would also point to a "12".

To me this seems fairly useless, and I would assume that this is not what the real operation is.

Second Guess

The pointer can change, so the compiler must reload the pointer's value before using it. But if it verifies (with an added check) that the pointer did not change, it can then assume that the value it points to has remained the same as well.

So my question is this:

What does declaring a volatile pointer to non volatile data actually do?
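
For reference, here is how I read the possible placements of the qualifier (just my own sketch; the names a, b, c are mine, not code from my program):

volatile int * a;           /* pointer to volatile int: the pointed-to value is volatile */
int * volatile b;           /* volatile pointer to int: the pointer itself is volatile (my case) */
volatile int * volatile c;  /* both the pointer and the pointed-to int are volatile */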

Les asked 4/8, 2016 at 23:17 Comment(22)
The volatile specifier doesn't have anything to do with threads.Backscratcher
@DavidSchwartz It tells the compiler that it doesn't know how the variable is going to be modified (i.e. it could be modified by a thread it does not know about... I guess I was not clear about what I meant in the question)Les
I imagine this is up to the optimizer, but it would seem legal for the compiler to not reload the value if the pointer didn't change. In practice it seems like that would be more work than just reloading the pointed-to value, so the optimizer would probably just reload everything.Tidings
volatile just means "it's observable behavior to read and write this value". So any mention of x means "you must read the value of x, and cannot assume you know it / you must write the value of x, and cannot assume it's pointless".Stimulant
@Les No, it doesn't. It has nothing to do with threads.Backscratcher
@DavidSchwartz Unless the compiler doesn't know the thread exists (i.e. another program that is changing memory in this program's RAM)Les
@Les The compiler typically has no idea whether threads exist or not, I don't know of any system where it would know one way or the other. But, regardless, the language certainly doesn't say that volatile has something to do with the compiler's knowledge of threads.Backscratcher
@DavidSchwartz I know... that was just an example of its use... other examples are hardware interrupts, hardware registers (like DMAs, pin controls, etc), shared memory... the sky is the limit.Les
@Les The point is that threads is not one of them. The volatile specifier doesn't have anything to do with threads.Backscratcher
@DavidSchwartz Obligatory: cxx.isvolatileusefulwiththreads.comAspidistra
@DavidSchwartz I have never actually used threads in C++; I guess I just assumed that they would be similar to interrupts in microcontrollers.Les
All the comments are referring to the fact that volatile cannot be used to address any of the well-known requirements for writing thread-safe code. It doesn't mean that volatile doesn't change the behavior of multi-threaded code in some cases, just that the changes are not necessarily what you expect.Debate
@DavidSchwartz Are you saying that volatile is never usable with threads, ever? On any existing architecture?Lelia
@Lelia The only time volatile is useful with threads is where the relevant language, threading, or compiler documentation says it has some defined semantics. Otherwise, you're just assuming it will continue to do what you want because it happened to do what you want when you tried it. All sensible threading standards provide guaranteed ways to get whatever semantics you need, and you should use those because they're guaranteed.Backscratcher
@DavidSchwartz Are you saying that volatile is not in practice guaranteed to produce consume semantics on CPUs that have consume ordering?Lelia
@Lelia Yes. In fact, I would say that "in practice guaranteed" is an oxymoron.Backscratcher
@DavidSchwartz What plausible semantics attributed to volatile makes it possible to break consume semantics? On which CPU?Lelia
@Lelia If you are asking C and C++ questions and you have to even think about details about CPU implementations, you are already doing something incredibly platform-specific. I will concede that volatile might have some platform-specific semantics that might be useful on some platforms. Too many times, I've seen code break horribly because assumptions were made about what future compilers or CPUs would be able to optimize. We need to decide to stop making that mistake sooner or later. I will not accept your invitation to make that mistake.Backscratcher
@DavidSchwartz How is it a mistake to assume that no future C/C++ optimizer will ever assume that a read of a volatile variable gives a predictable value? That would defeat the whole point.Lelia
@Lelia Imagine if there's a future CPU where providing those predictable values has a huge performance cost but providing all the semantics actually required to be supported by the standard (such as volatile std::sig_atomic_t) had minimal performance cost. Wouldn't sensible compilers for that platform implement only what the standard actually requires? (And the worst case scenario imaginable would be if compiler writers have no choice but to make the performance poor because people listened to your advice and relied on behavior wisely not guaranteed by the standard.)Backscratcher
@DavidSchwartz There is no market for such a CPU.Lelia
At one time, people actually did argue that there was no market for CPUs that executed memory operations out of order. I prefer to learn from past mistakes than repeat them.Backscratcher

int *volatile x; declares a volatile pointer to a non-volatile int.

Whenever the pointer is accessed, the volatile qualifier guarantees that its value (the value of the pointer) is re-read from memory.

Since the pointed-to int is non-volatile, the compiler is allowed to reuse a previously cached value at the address pointed to by the current value of the pointer. Technically this is allowed regardless of whether the pointer has changed or not, as long as there exists a cached value originally retrieved from the current address.
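
To make this concrete, here is a minimal sketch (the polling loop and the names poll_sum etc. are hypothetical, not from the question):

extern int * volatile x;    /* volatile pointer to non-volatile int */

int poll_sum(void)
{
    int sum = 0;
    for (int i = 0; i < 4; ++i) {
        /* x itself must be re-read on every iteration (it is volatile),
           but the int it points to may be taken from a value previously
           read from that same address */
        sum += *x;
    }
    return sum;
}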


[ EDIT ] To address @DavidSchwartz's comment, I should note that "re-read from memory" is a (not pedantically precise, but AFAIK commonly used) shorthand for "as if it were re-read from memory in the abstract machine".

For example, C11 draft N1570 6.7.3/7 says:

An object that has volatile-qualified type may be modified in ways unknown to the implementation or have other unknown side effects. Therefore any expression referring to such an object shall be evaluated strictly according to the rules of the abstract machine, as described in 5.1.2.3. Furthermore, at every sequence point the value last stored in the object shall agree with that prescribed by the abstract machine, except as modified by the unknown factors mentioned previously (134). What constitutes an access to an object that has volatile-qualified type is implementation-defined.

The same draft has a footnote for 6.5.16/3 (assignment operators):

The implementation is permitted to read the object to determine the value but is not required to, even when the object has volatile-qualified type

So in the end volatile does not require a physical memory read, but the observable behavior of a compliant implementation must be as if one was made regardless.
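
As an illustration of the 6.5.16/3 footnote, consider this hypothetical snippet (not from the question):

int * volatile x;
int y;

void assign(void)
{
    int *p = (x = &y);  /* the value of the assignment x = &y may be taken to be &y
                           directly; the implementation is not required to read the
                           volatile x back to obtain it */
    (void)p;            /* silence unused-variable warnings */
}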

Dougall answered 4/8, 2016 at 23:30 Comment(15)
This is a very common myth. The volatile qualifier does not require values to be re-read from memory. If it did, it would result in absurdly huge performance drops for no reason.Backscratcher
@DavidSchwartz Huh? That's exactly what it means. See 5.1.2.3/6 in C11. (And yes, it may cause measurable performance drop compared to non-volatile use)Lanthanide
The CPU must issue a read (or write) but without a memory fence there is no guarantee that different cores will see the same value.Bosch
volatile is not commonly used. It's used in places such as drivers using memory mapping to read values from hardware, where the compiler optimizing out multiple reads impacts operation. Likewise when waiting on DMA to finish a write by polling the write location.Mcfarland
@Lanthanide Nowhere does that say anything has to be re-read from memory. And modern systems do not re-read from memory on accesses to volatiles.Backscratcher
@DavidSchwartz it says "Accesses to volatile objects are evaluated strictly according to the rules of the abstract machine.", and the rules of the abstract machine are that reading a variable reads from the storage area assigned to that variable. Any system that doesn't respect this rule is non-conforming.Lanthanide
@Lanthanide Either "storage area assigned to that variable" means wherever it happens to be stored now or it means main memory. Either way, the statement is false. If it means wherever it happens to be stored now, then that would permit a read from a register if that were where the variable were stored then. If it means main memory, that would mean you'd read from main memory even if the variable were currently stored in, say, another core's L2 cache, which would provide nonsensical results. So, no, you still have it wrong.Backscratcher
@DavidSchwartz in the abstract machine there is assigned a storage location, and the quote in my previous comment says that the implementation must implement that. Nothing to do with registers, main memory, cache etc. Perhaps you're trying to say that the "common misconception" is compilers being bugged w.r.t the implementation of volatile. (NB. Not replying further, this discussion isn't appropriate for comments; make or link to a question about the topic)Lanthanide
@Lanthanide So are you still saying it must be re-read from memory or not?Backscratcher
@DavidSchwartz: Did you read the defect report regarding volatile (DR476) for C11?Osteoma
@ash: Considering that C is widely used in embedded devices and those are the vast majority of architectures, volatile very well is widely used. It just does not guarantee memory consistency; that has to be guaranteed by the hardware (e.g. by having the memory area uncached/unshared/strictly ordered). Polling some memory location for DMA completion is a bad idea. The DMA has to guarantee it has completed. PCI(e) provides special accesses for that which should be used by DMA controllers. Not that there are no broken devices in the field...Osteoma
@Olaf And that clearly permits exactly the types of optimizations that would prevent re-reading values from memory where the implementation knows that has no needed side-effects. So, again, the claim that re-reading from memory is required is false.Backscratcher
Erm - yes. But with the given information, I don't see reason to assume we have such a scenario. Things may be different for an automatic variable where no pointer to it is passed (volatile-correctly).Osteoma
Your new change makes this answer still completely misleading. You say, "but the observable behavior of a compliant implementation must be as if one was made regardless" which you already agreed was NOT true. Since we agree it might modify cache, it might modify memory, it might modify a register, how the heck do you know what to observe? What if another thread observes memory and the change was made to a register? (And you can see this in current implementations where memory fences are needed to make the change observable TO OTHER THREADS and they are not emitted in the code.)Backscratcher
@DavidSchwartz Sorry, but you are confusing the abstract machine rules, which are always binding, with optimizations that compliant implementations are allowed to perform based on certain intimate knowledge of the real machine. If you dispute that, it would be more productive to post a separate question spelling it out.Dougall

The volatile means that the value of the pointer (i.e., the memory location that it points to) can change; consequently, the compiler must either ensure that the various caches have the same value for that pointer, or load the pointer from memory for every read and store it back to memory for every write.

The volatile says nothing about the pointed-to value, however. So it can change and may have different values in different threads.
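
For example (a hypothetical expression, not from the question; see also the comments below):

extern int * volatile x;

int twice(void)
{
    /* the volatile pointer x must be read twice here, but (per the comments
       below) the compiler may be able to satisfy both dereferences with a
       single load of the non-volatile int it points to */
    return *x + *x;
}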

Thoughtless answered 4/8, 2016 at 23:34 Comment(19)
"...various caches" this is wrong. You need to use the appropriate memory fence for that.Bosch
Can you clarify what you mean by your second paragraph?Lanthanide
@Lanthanide the pointer is a variable that lives in a memory space; that space is treated as volatile (when that variable is accessed), but the compiler is free to optimize reads to the memory pointed-at by that variable (e.g. *x + *x requires two reads of the pointer, but may yield only one read of the address pointed at by x).Mcfarland
@Mcfarland your comment is correct but I don't see how "So it can change and may have different values in different threads." summarizes your comment. All variables only have one value, regardless of how many threads there are.Lanthanide
@Mcfarland "may yield only one read" Which CPU makes that transformation worthwhile? Is there any compiler that attempts such transformation?Lelia
@Lelia x86 CPUs do. They read data from memory and cache before it's needed and reuse read values across instructions.Backscratcher
@DavidSchwartz Which CPU can read data before it's needed when the pointer has not yet been read?Lelia
@Lelia The x86 CPU. It performs speculative fetches. There's no theoretical limit to how clever it can be though, of course, there are practical limits.Backscratcher
@DavidSchwartz I see: the CPU could try to guess the address, pre-fetch, load address, verify guess, use pre-fetched value. Correct?Lelia
@Lelia Yes. And it does in fact do this. And even if no current CPU does this, it would be insane to write code on the assumption that future CPUs won't do this. If your code is relying on behavior to remain the same when that behavior is not guaranteed, it's terrible code. Sane systems provide sufficient guarantees to write software and you should use those guarantees so your code won't fail with a new CPU, new compiler, optimization, or whatever.Backscratcher
I get it. But whether *x + *x loads *x once or twice when x has volatile type has no impact on correctness of MT programs. (The only case where it conceivably could matter that *x is read exactly twice is when reading that address has side effects. Then the address would be mapped such that it is not cachable and the CPU would not try any optimisation.)Lelia
@Lelia Yes, but whether i+=*x; j+=*y; i+=*x; loads *x once or twice can. Why rely on behavior that's explicitly not guaranteed when your platform provides you mechanisms whose behavior is guaranteed? It's madness.Backscratcher
@DavidSchwartz "mechanisms whose behavior is guaranteed" But the committee admitted that the consume order was badly specified (indeed it's a ridiculous specification), not really supported by most compilers (or even with an attempted support that randomly fails), where people could use volatile instead (on those CPUs that have load=consume).Lelia
Let us continue this discussion in chat.Backscratcher
@Lelia - compilers optimize for common sub-expressions. So, in my example, *x + *x, without optimization translates into two pointer dereference machine instructions. With optimization, it'll be read only once (most likely - it's more complicated than that). Using the volatile keyword will prevent that optimization. That's important because sometimes a memory read is something more - such as reading memory mapped from a hardware device - and changing the number of reads can yield unexpected results.Mcfarland
@Lelia - as for multi-threaded applications, one read versus two can have very real differences in results because another thread may be changing the value in-between the reads. And yes, the timing is insanely tight, but it can - and does - happen.Mcfarland
@Mcfarland Yes sometimes a memory location read doesn't have read semantic. (Just like on linux a read from a file that happens to be a device could trigger any action that the device driver code deems appropriate.) Using memory mapped device requires the appropriate setting of the memory management unit of the CPU to avoid caching. OTOH reading an atomic variable (a volatile field in Java) has read semantic.Lelia
@Mcfarland "*x + *x, without optimization translates into two pointer dereference machine instructions" using a commutative operator like operator+(int,int) nullifies it, but you have an order of evaluation issue if you perform two "read actions" in the same statement, without a sequence point. In general you need to impose an ordering on operations: consider *x - *x. The behavior would be unspecified and compiler dependent.Lelia
@Lelia - that simple *x + *x expression is for explanatory purposes only.Mcfarland
