Does strict aliasing apply when using pointers to struct members?
Does test_func in the following snippet trigger undefined behavior under the strict aliasing rules when its two arguments partially overlap?

That is, the second argument points to a member of the object the first argument points to:

#include <stdio.h>

typedef struct
{
    //... Other fields
    int x;
    //... Other fields
} A;

int test_func(A *a, int *x)
{
    a->x = 0;
    *x = 1;
    return a->x;
}

int main()
{
    A a = {0};

    printf("%d\n", test_func(&a, &a.x));

    return 0;
}

Is the compiler allowed to conclude that test_func will always return 0, based on the assumption that A* and int* do not alias, so that the write through x cannot overwrite the member?

Diacaustic answered 3/2, 2017 at 13:22 Comment(10)
I don't think so, because that is the purpose of the keyword restrict in C.Holbert
I thought that strict aliasing referred to the case of converting between pointer types, assuming that the types were of the same size. For example, converting int* to float*, on a platform with sizeof int == sizeof float. And I thought that this restriction was due to the fact that each platform may have different alignment requirements, which under certain scenarios might lead to undefined behavior. I don't see any of that in your code.Universalize
As a rule of thumb, you can say that there can only be a ``strict aliasing'' violation if you cast a pointer to a different type or if you convert it to void* and then to another type. Here in your case, all type information is present to the compiler, so it must not create problems.Fulks
@barakmanos Strict aliasing is described by a hard-to-read section of the C standard 6.5 (see this answer). It is about doing a pointer conversion followed by a value access. If the pointed-at data is not compatible, as determined by that list of rules, then there is undefined behavior and the compiler is allowed to assume that the scenario won't happen. For example (ignoring alignment and endianness) the code long i=1; *(short*)&i = 2; contains a strict aliasing violation and the compiler is therefore free to give the result 1. Or crash and burn.Cia
("Strict aliasing" is an informal term that does not appear in the C standard, and maybe not the best name for it. It would be more accurate to call it "effective type value access" or something like that.)Cia
@Lundin: So you're saying that float f = 5; int i = *(int*)&f; is not strict aliasing on a platform with sizeof int == sizeof float (i.e., no UB if ignoring alignment and endianness)?Universalize
@barakmanos your example is still a violation of strict aliasing. The size of the types is not relevant, nor is alignment (though that might present a different cause of undefined behavior for your example). The rule says that an object of some particular type can only be accessed by lvalues (including dereference of pointers) of that same type, with a few exceptions. Outside of aliasing, there's also actually no guarantee that (int *)&f and &f will even point to the same object, though most real-world compilers and programs assume that they do.Licensee
@barakmanos It is a strict aliasing violation. The effective types are float and int, which are not compatible types. And none of the special exceptions (unions, arrays, character types) apply. But this has nothing to do with type sizes or alignment, but with the concept of compatible type in C.Cia
Anyway, the C and C++ canonical duplicate What is the strict aliasing rule? is good reading, with several links to more good reading.Cia
@Cia "you can say that there can only be a ``strict aliasing'' violation if" ... or you use unions. You forgot unions.Tiffin
C
7

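Strict aliasing is about the case where a pointer is converted to another pointer type and the pointed-at contents are then accessed through it. The rule requires that the type used for the access is compatible with the type of the object actually stored there. That does not apply here, since the member x is an int and is accessed through an int*.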
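For contrast, here is a minimal sketch of what a real violation looks like next to a well-defined alternative. The function name bits_by_memcpy is made up for illustration, and the sketch assumes float and uint32_t have the same size (checked at compile time) and that float uses the IEEE 754 binary32 format.

```c
#include <stdint.h>
#include <string.h>

/* Assumption for this sketch: float and uint32_t are the same size. */
_Static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits wide");

/* A strict aliasing violation would look like this:
 *
 *     uint32_t u = *(uint32_t *)&f;
 *
 * The object f has type float, but it is read through an lvalue of the
 * incompatible type uint32_t, which is undefined behavior.
 */

/* Well-defined alternative: copy the object representation byte by byte. */
uint32_t bits_by_memcpy(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}
```

Nothing like the commented-out cast appears in the question's code: there, an int object is only ever accessed as an int.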

There is, however, the separate concept of pointer aliasing, meaning that two pointers can refer to the same memory. Here the compiler must assume that a->x and *x may alias; it is not allowed to assume that they don't. If it wanted to do the optimization you describe, it would have to add machine code that compares the pointers to determine whether they refer to the same object, which in itself would make the function slightly slower.
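As a rough sketch, the pointer comparison described above could be written out by hand as follows. The name test_func_checked and the fields before and after are hypothetical stand-ins (the latter for the question's "other fields"). This is not what any particular compiler emits; a compiler may just as well reload a->x after the store.

```c
typedef struct
{
    int before; /* hypothetical stand-in for "other fields" */
    int x;
    int after;  /* hypothetical stand-in for "other fields" */
} A;

/* Same observable behavior as test_func from the question, with the
   aliasing check made explicit instead of re-reading a->x. */
int test_func_checked(A *a, int *x)
{
    a->x = 0;
    *x = 1;
    if (x == &a->x)
        return 1; /* the store through x overwrote the member */
    return 0;     /* the store went elsewhere, so a->x is still 0 */
}
```

Calling it with &a and &a.x gives 1, while calling it with &a and the address of an unrelated int gives 0.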

To help the compiler optimize such code, you can declare the pointers as restrict, which tells the compiler that the programmer guarantees that the pointers are not pointing at the same memory.

Your function compiled with gcc -O3 results in this machine code:

0x00402D09  mov    $0x1,%edx

In other words, the whole function call was inlined and reduced to the constant result 1.

But if I rewrite your function as

int test_func(A* restrict a, int* restrict x)
{
    a->x = 0;
    *x = 1;
    return a->x;
}

and compile with gcc -O3, it does return 0. That is because I have now told the compiler that a->x and x do not point at the same memory, so it can assume that *x = 1; does not affect the result and skip the line *x = 1; or sequence it before the line a->x = 0;.

The optimized machine code of the restrict version actually skips the whole function call, since it knows that the value is already 0 as per your initialization.

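This is of course a bug, but the programmer is to blame for it: passing overlapping pointers to restrict-qualified parameters breaks the guarantee that restrict makes, so the behavior is undefined.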
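For comparison, a minimal sketch of a call that keeps the restrict promise. The name test_func_restrict and the fields before and after are hypothetical; the only change from the version above is that the caller is expected to pass an int that is not part of the struct.

```c
typedef struct
{
    int before; /* hypothetical stand-in for "other fields" */
    int x;
    int after;  /* hypothetical stand-in for "other fields" */
} A;

/* restrict is a promise by the caller: within this function, an object
   modified through one of the pointers is not accessed through the other. */
int test_func_restrict(A *restrict a, int *restrict x)
{
    a->x = 0;
    *x = 1;
    return a->x;
}
```

With a caller such as A a = {0}; int other = 0; test_func_restrict(&a, &other); the two pointers refer to distinct objects, so returning 0 is correct whether or not the compiler reloads a->x.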

Cia answered 3/2, 2017 at 13:39 Comment(0)
A
4

This is not a violation of strict aliasing. The strict aliasing rule says (simplified) that you can access the value of an object only using an lvalue expression of a compatible type. In this case, the object you're accessing is the member x of main's a variable. This member has type int. And the expression you use to access it (*x) also has type int. So there's no problem.

You may be confusing strict aliasing with restrict. If you had used the restrict keyword in the declaration of one of the pointer parameters, the call in your example would be invalid, because restrict promises that an object modified through one pointer is not also accessed through a different one. But that is a different issue from strict aliasing.

Anallese answered 3/2, 2017 at 13:33 Comment(2)
The way gcc interprets strict aliasing, given struct foo {int x;}; struct bar {int x;}; union u {struct foo vf; struct bar fb;}; int test(struct foo *pf, struct bar *pb) { pf->x = 1; pb->x=2; return pf->x; } it will generate code that assumes pf->x and pb->x cannot alias despite the fact that both structures are members of the same union type whose complete definition is visible at the point of access. I don't think the Standards can be reasonably interpreted to allow such behavior (it would render meaningless the requirement that the "complete union type" be visible)...Leiker
...but the authors of gcc have indicated they have no intention of supporting the Common Initial Sequence (CIS) rule for accesses which aren't made through a pointer of the union type (even if the union type is visible, and the objects identified by the pointers in question could be--or even are declared as--members of the same union).Leiker
