Should volatile be used when mapping GPU memory?

Both OpenGL and Vulkan allow to obtain a pointer to a part of GPUs memory by using glMapBuffer and vkMapMemory respectively. They both give a void* to the mapped memory. To interpret its contents as some data it has to be cast to an appropriate type. The simplest example could be to cast to a float* to interpret the memory as an array of floats or vectors or similar.

It seems that any kind of memory mapping is undefined behaviour in C++, as it has no notion of memory mapping. However, this isn't really an issue because this topic is outside of the scope of the C++ Standard. However, there is still the question of volatile.

In the linked question the pointer is additionally marked as volatile because the contents of the memory it points at can be modified in a way that the compiler cannot anticipate during compilation. This seems reasonable though I rarely see people use volatile in this context (more broadly, this keyword seems to be barely used at all nowadays).

At the same time in this question the answer seems to be that using volatile is unnecessary. This is due to the fact that the memory they speak of is mapped using mmap and later given to msync which can be treated as modifying the memory, which is similar to explicitly flushing it in Vulkan or OpenGL. I'm afraid though that this doesn't apply to neither OpenGL nor Vulkan.

In case of the memory being mapped as not GL_MAP_FLUSH_EXPLICIT_BIT or it being VK_MEMORY_PROPERTY_HOST_COHERENT_BIT than no flushing is needed at all and the memory contents update automagically. Even if the memory is flushed by hand by using vkFlushMappedMemoryRanges or glFlushMappedBufferRange neither of these functions actually takes the mapped pointer as a parameter, so the compiler couldn't possibly know that they modify the mapped memory's contents.

As such, is it necessary to mark pointers to mapped GPU memory as volatile? I know that technically this is all undefined behaviour, but I am asking what is required in practice on real hardware.

By the way, neither the Vulkan Specification or the OpenGL Specification mention the volatile qualifier at all.

EDIT: Would marking the memory as volatile incur a performance overhead?

OK, let's say that we have a compiler that is omniscient about everything that happens in your code. This means that the compiler can follow any pointer, even through the runtime execution of your code perfectly and correctly every time, no matter how you try to hide it. So even if you read a byte at one end of your program, the compiler will somehow remember the exact bytes you've read and anytime you try to read them again, it can choose to not execute that read and just give you the previous value, unless the compiler is aware of something that can change it.

But let's also say that our omniscient compiler is completely oblivious to everything that happens in OpenGL/Vulkan. To this compiler, the graphics API is a black box. Here, there be dragons.

So you get a pointer from the API, read from it, the GPU writes to it, and then you want to read that new data the GPU just wrote. Why would a compiler believe that the data behind that pointer has been altered; after all, the alterations came from outside of the system, from a source that the C++ standard does not recognize.

That's is what volatile is for, right?

Well, here's the thing. In both OpenGL and Vulkan, to ensure that you can actually read that data, you need to do something. Even if you map the memory coherently, you have to make an API call to ensure that the GPU process that wrote to the memory has actually executed. For Vulkan, you're waiting on a fence or an event. For OpenGL, you're waiting on a fence or executing a full finish.

Either way, before executing the read from the memory, the omniscient compiler encounters a function call into a black box which as established earlier the compiler knows nothing about. Since the mapped pointer itself came from the same black box, the compiler cannot assume that the black box doesn't have a pointer to that memory. So as far as the compiler is concerned, calling those functions could have written data to that memory.

And therefore, our omniscient-yet-oblivious compiler cannot optimize away such memory accesses. Once we get control back from those functions, the compiler must assume that any memory from any pointer reachable through that address could have been altered.

And if the compiler were able to peer into the graphics API itself, to read and understand what those functions are doing, then it would definitely see things that would tell it, "oh, I should not make assumptions about the state of memory retrieved through these pointers."

This is why you don't need volatile.

Also, note that the same applies to writing data. If you write to persistent, coherent mapped memory, you still have to perform some synchronization action with the graphics API so that your CPU writes so that the GPU isn't reading it. So that's where the compiler knows that it can no longer rely on its knowledge of previously written data.

Recommended topics

Hot tags