Is this use of the Effective Type rule strictly conforming?

Asked 5/10, 2017 at 18:4 Answered 24/10, 2017 at 11:31

c gcc undefined-behavior c99 strict-aliasing

The Effective Type rule in C99 and C11 provides that storage with no declared type may be written with any type, and that storing a value of a non-character type will set the Effective Type of the storage accordingly.

Setting aside the fact that INT_MAX might be less than 1234567890, would the following code's use of the Effective Type rule be strictly conforming?

#include <stdlib.h>
#include <stdio.h>

/* Performs some calculations using int, then float,
  then int.

    If both results are desired, do_test(intbuff, floatbuff, 1);
    For int only, do_test(intbuff, intbuff, 1);
    For float only, do_test(floatbuff, floatbuff, 0);

  The latter two usages require storage with no declared type.    
*/

void do_test(void *p1, void *p2, int leave_as_int)
{
  *(int*)p1 = 1234000000;

  float f = *(int*)p1;
  *(float*)p2 = f*2-1234000000.0f;

  if (leave_as_int)
  {
    int i = *(float*)p2;
    *(int*)p1 = i+567890;
  }
}

void (*volatile test)(void *p1, void *p2, int leave_as_int) = do_test;

int main(void)
{
  int iresult;
  float fresult;
  void *p = malloc(sizeof(int) + sizeof(float));
  if (p)
  {
    test(p,p,1);
    iresult = *(int*)p;
    test(p,p,0);
    fresult = *(float*)p;
    free(p);
    printf("%10d %15.2f\n", iresult,fresult);
  }
  return 0;
}

From my reading of the Standard, all three usages of the function described in the comment should be strictly conforming (except for the integer-range issue). The code should thus output 1234567890 1234000000.00. GCC 7.2, however, outputs 1234056789 1157904.00. I think that when leave_as_int is 0, it's storing 1234000000 to *p1 after it stores 1234000000.0f to *p2, but I see nothing in the Standard that would authorize such behavior. Am I missing anything, or is gcc non-conforming?

Anaemic answered 5/10, 2017 at 18:4 Comment(0)

Yes, this is a gcc bug. I've filed it (with a simplified testcase) as https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82697 .

Esemplastic answered 24/10, 2017 at 11:31 Comment(7)

Thanks for posting that bug. Interesting that this store-store ordering bug seems to have yielded such a quick fix, but 82697 seems to be taking longer. I don't know that the authors of the Standard intended to require the pessimizations necessary to recognize that *pi and *pl might alias even when there is no evidence to suggest that they might, but certainly when evidence of aliasing exists (as with the pointer casts in the example above) recognizing the possibility is not unreasonably pessimistic, but simply realistic. – Anaemic 24/10, 2017 at 18:50

@Anaemic "pessimistic" "realistic" ... why not probabilistic? Are you talking about language semantics or risk assessment as with pesticides, drugs? – Steeplebush 2/6, 2018 at 17:7

Any particular program will either be 100% reliable even if the compiler reorders the stores, or it won't. There's nothing probabilistic about it. Programs where the ordering of stores would matter in the above situation are far more rare than programs that would require that compilers preserve ordering in some other situations not mandated by 6.5p7. Prior to C99, the authors of the Standard viewed the ability to recognize situations where an lvalue or pointer of one type was derived from an lvalue or pointer of another as purely a Quality of Implementation issue. C99 added rules which... – Anaemic 2/6, 2018 at 22:35

@curiousguy: ...are hard to understand or to process and impede what should be useful optimizations, while failing to handle even simple straightforward cases. A better rule would be to say that if a byte of storage is changed during any particular execution of a function or loop, all accesses must be done using lvalues which are derived, during such execution, from pointers or lvalues which identify the same object or members of the same array. Further, all use of a derived lvalue to access a byte of storage must precede use of any lvalue not derived from it in relation to the same storage. – Anaemic 2/6, 2018 at 22:43

@Anaemic "particular execution of a function" what about inlining then? – Steeplebush 2/6, 2018 at 23:23

@curiousguy: If function X calls function Y which calls function Z, anything that occurs within that execution of function Z occurs within that execution of function Y, whether in-lined or not. If Y derives two pointers of different types to the same storage and passes them to Z, and Z happens to be in-lined, a compiler might be able to know that the pointers identify the same storage, but a compiler shouldn't be required to know that, since in the non-inline scenario it would have no way of knowing where its arguments came from. On the other hand... – Anaemic 3/6, 2018 at 7:57

...if Y derived a pointer of one type to some storage, passed a pointer to Z1 (which used it to access the storage), and then Y derived a pointer of a different type to the same storage and passed that to Z2 (which also used it), a non-stupid compiler should have no trouble recognizing that the accesses to the storage in Z1 need to be sequenced before those in Z2, since the operation that derives the lvalue used in Z2 would occur after the execution of Z1 and before the execution of Z2. – Anaemic 3/6, 2018 at 8:1

The generated machine code unconditionally writes to both pointers:

do_test:
        cmpl    $1, %edx
        movl    $0x4e931ab1, (%rsi)
        sbbl    %eax, %eax
        andl    $-567890, %eax
        addl    $1234567890, %eax
        movl    %eax, (%rdi)
        ret

This is a GCC bug because all stores are expected to change the dynamic type of the memory accessed. I do not think this behavior is mandated by the standard; it is a GCC extension.

Can you file a GCC bug?

Damning answered 5/10, 2017 at 20:19 Comment(2)

My reading of the linked post suggests that having stores change the dynamic type of objects with a declared type is an extension, but that such behavior is required by the Standard for objects with no declared type [e.g. a pointer received from malloc()]. Upholding the Standard would require that a compiler make certain pessimistic assumptions in cases which will often yield needlessly inefficient code, and it would make sense for compilers to offer a non-conforming mode that wouldn't make allowances for objects' dynamic/Effective Types to change in such cases. – Anaemic 5/10, 2017 at 20:44

Otherwise, I'm not presently registered to file gcc bugs, though perhaps I should register. I'm not sure, though, what should be proposed as a remedy if there's no way the behavior can be described as conforming. Given that upholding the Standard would needlessly impede optimization in some cases, it may be practical to simply say that -fno-strict-aliasing is required for full conformance and to specify that -fstrict-aliasing is only usable with code that never recycles storage. Unfortunately, that would mean any code that needs to recycle storage would be processed inefficiently. – Anaemic 5/10, 2017 at 23:21
