May the compiler optimize out stores through a pointer-to-volatile? [duplicate]

Writes to volatile variables are side effects in C++ and generally can't be optimized out under the as-if rule. In practice, this usually means that on inspection of the assembly you'll see one store for each volatile store performed by the abstract machine[1].

However, it isn't clear to me if the stores must be performed in the following case where the underlying object is not volatile, but the stores are done through a pointer-to-volatile:

void vtest() {
    int buf[1];

    volatile int * vptr = buf;

    *vptr = 0;
    *vptr = 1;
    *vptr = 2;
}

Here, gcc does in fact optimize out all of the stores. Clang does not. Oddly, the behavior depends on the buffer size: with buf[3] gcc emits the stores, but with buf[4] it doesn't and so on.

Is gcc's behavior here legal?


[1] With some small variations; e.g., some compilers will use a single read-modify-write instruction on x86 to implement something like v++ where v is volatile.
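
For instance, a minimal sketch of that variation (assuming an x86 target; whether a compiler actually folds the load and store into one instruction is up to the implementation):

volatile int v;

void bump(void)
{
    v++;   /* may be emitted as a single read-modify-write, e.g. add dword ptr [v], 1 */
}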

Franklyn asked 10/2, 2021 at 7:24 · Comments (5)
It could be related to sequence points... You need to dive into n3337. You might be interested in the CompCert or Decoder projects (or the C++ extensions of Frama-C...). – Lowelllowenstein
I suggest asking the same question on a GCC mailing list. – Lowelllowenstein
@BasileStarynkevitch - hmm, this seems to answer the question (at least in standards with the new text): it's the qualification of the pointer used for the access that matters, not the definition of the underlying object. – Franklyn
In the current draft, this sentence seems to be relevant: "The semantics of an access through a volatile glvalue are implementation-defined." – Acerbity
Related: "Why is a volatile local variable optimised differently from a volatile argument, and why does the optimiser generate a no-op loop from the latter?" and "Does accessing a declared non-volatile object through a volatile reference/pointer confer volatile rules upon said accesses?"; with its answer, this could even be a duplicate. – Gnathion

While it would be useful for the C and C++ Standards to recognize categories of implementations where volatile loads and stores have particular semantics, and to report via predefined macros, intrinsics, or other such means what semantics a particular implementation is using, neither Standard presently does so. Given a loop:

void hang_if_nonzero(int mode)
{
  /* the unary + forces an lvalue-to-rvalue conversion, i.e. a volatile read of 0x1234 */
  do { +*(volatile int*)0x1234; } while(mode);
}

a compiler would be required to generate code that will block program execution if mode is non-zero, because the volatile read is defined as being a side effect in and of itself, regardless of whether there is any means by which the effect of executing it could be distinguished from that of skipping it. There would be no requirement, however, that the compiler actually generate any load instructions for the volatile access. If the compiler specified that it was only for use on hardware platforms where the effect of reading address 0x1234 would be indistinguishable from the effect of skipping the read, it would be allowed to skip the read.
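
As a minimal sketch (hypothetical, assuming a platform where reading address 0x1234 has no observable effect), the generated code could legally behave like the following hand-written equivalent, which performs no load at all:

void hang_if_nonzero_equivalent(int mode)
{
    /* no load from 0x1234 is emitted; only the control flow of the
       abstract machine is preserved */
    if (mode)
        for (;;) { }   /* never terminates, just as the original loop wouldn't when mode is non-zero */
}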

In cases where an object's address is taken, but a compiler can account for all the ways in which the address is used and the code never inspects the representation of the address, the compiler would not be required to allocate "normal" addressable storage, but may at its leisure allocate a register or other form of storage which wouldn't be accessed via normal loads and stores. It may even pretend to allocate storage without actually doing so if it can tell what value an object would contain when accessed. If, e.g., a program were to do something like:

int test(int mode)
{
  int a[2] = {1,2};
  int *p = a;
  return p[mode & 1] + p[mode & 1];
}

a compiler wouldn't be required to actually allocate any storage for a, but could instead at its leisure generate code equivalent to return (1+(mode & 1)) << 1;.

Even if p were declared as int volatile *p = a;, that would not create a need for the compiler to allocate addressable storage for a, since a compiler could still account for everything done through pointers to a and would thus have no obligation to keep a in addressable storage. A compiler would thus be allowed to treat a read of a[mode & 1] as equivalent to evaluating the expression (1+(mode & 1)). If the read were done through a volatile pointer, then it would need to be treated as a side effect for purposes of determining whether a loop may be omitted, but there would be no requirement that the read itself actually do anything.
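
A sketch of that last variant (hypothetical; the names mirror the earlier example, and the comment reflects the reasoning above rather than a guarantee about any particular compiler):

int test_volatile(int mode)
{
  int a[2] = {1,2};
  int volatile *p = a;
  /* each read through p counts as a side effect (e.g. when deciding whether a
     loop may be omitted), but the compiler may still know the stored values and
     fold the result to (1 + (mode & 1)) << 1 without keeping a in addressable
     storage */
  return p[mode & 1] + p[mode & 1];
}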

Fidele answered 10/2, 2021 at 17:40 · Comments (2)
Did "volatile write is defined as being a side effect" actually mean a volatile read?Moe
@BenVoigt: Thanks; I reworked that text a little. Anything else you noticed? – Fidele
