Here is the strict aliasing rule in action. One assumption the C (or C++) compiler makes is that pointers to objects of different types never refer to the same memory location when dereferenced (i.e. they never alias each other).
Given the function
int f(struct t1* p1, struct t2* p2);
the compiler assumes that p1 != p2, because formally the two pointers point to different types. As a result, the optimizer may assume that p2->m = -p2->m; has no effect on p1->m. It can first read p1->m into a register and compare it with 0; if it is less than 0, it performs p2->m = -p2->m; and finally returns the register value unchanged!
The union in the example is there only to make p1 == p2 at the binary level, because all members of a union have the same address.
Another example:
struct t1 { int m; };
struct t2 { int m; };

int f(struct t1* p1, struct t2* p2)
{
    if (p1->m < 0) p2->m = -p2->m;
    return p1->m;
}

int g()
{
    union {
        struct t1 s1;
        struct t2 s2;
    } u;
    u.s1.m = -1;
    return f(&u.s1, &u.s2);
}
What must g return? Common sense says +1, since f changes -1 to +1. Now look at the assembly gcc generates with -O1 optimization:
f:
    cmp DWORD PTR [rdi], 0
    js .L3
.L2:
    mov eax, DWORD PTR [rdi]
    ret
.L3:
    neg DWORD PTR [rsi]
    jmp .L2
g:
    mov eax, 1
    ret
So far everything is as expected. But when we try it with -O2:
f:
    mov eax, DWORD PTR [rdi]
    test eax, eax
    js .L4
    ret
.L4:
    neg DWORD PTR [rsi]
    ret
g:
    mov eax, -1
    ret
The return value is now a hardcoded -1. This is because f caches the value of p1->m in the eax register at the beginning (mov eax, DWORD PTR [rdi]) and does not reread it after p2->m = -p2->m; (neg DWORD PTR [rsi]). It returns eax unchanged.
The union is used here only because all non-static data members of a union object have the same address. As a result, &u.s1 and &u.s2 are the same address.
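A minimal sketch of that address claim. It assumes a named union (u12 is a hypothetical name; the union in g is anonymous) and a helper, members_share_address, that is not part of the original code:

```c
#include <assert.h>
#include <stddef.h>

struct t1 { int m; };
struct t2 { int m; };

/* Hypothetical named version of the anonymous union used in g. */
union u12 {
    struct t1 s1;
    struct t2 s2;
};

/* Returns 1 when both union members start at the same address. */
int members_share_address(void)
{
    union u12 u;

    /* Every member of a union is located at offset 0. */
    assert(offsetof(union u12, s1) == 0 && offsetof(union u12, s2) == 0);

    /* Cast to void* so that pointers to different struct types
       can be compared directly. */
    return (void *)&u.s1 == (void *)&u.s2;
}
```

Only addresses are compared here; no member is read or written, so strict aliasing does not come into play in this check.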
For anyone who does not read assembly, here is how strict aliasing affects the code of f, expressed in C/C++:
int f(struct t1* p1, struct t2* p2)
{
    int a = p1->m;
    if (a < 0) p2->m = -p2->m;
    return a;
}
The compiler caches the value of p1->m in the local variable a (in reality a register, of course) and returns it, even though p2->m = -p2->m; changes p1->m. The compiler assumes that the memory behind p1 is not affected, because it assumes that p2 points to different memory that does not overlap with p1.
So with different compilers and different optimization levels, the same source code can return different values (-1 or +1). That is undefined behavior as it is.
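For contrast, here is a sketch of what happens when both parameters have the same type. The names f_same_type and g_same_type are hypothetical and not part of the original example. Because p1 and p2 may now legitimately point to the same object, the compiler cannot keep a stale copy of p1->m across the store through p2:

```c
#include <assert.h>

struct t1 { int m; };

/* Hypothetical variant of f: both parameters have the same type,
   so the compiler must allow for p1 and p2 naming the same object. */
int f_same_type(struct t1 *p1, struct t1 *p2)
{
    if (p1->m < 0) p2->m = -p2->m;
    return p1->m;   /* must reflect the store made through p2 */
}

/* Hypothetical driver: passes the same object for both arguments. */
int g_same_type(void)
{
    struct t1 s = { -1 };
    int r = f_same_type(&s, &s);

    /* Well defined: -1 is negated to +1 and then re-read. */
    assert(r == 1);
    return r;
}
```

This version returns +1 at every optimization level, since no aliasing assumption applies to two pointers of the same type.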
f can assume that p1 != p2 because they point to different types, so with optimization it reads the value of p1->m into a register and returns that register. It assumes that p2->m = -p2->m does not modify p1->m, which is wrong. The union here is only a way to make p1 == p2. – Emmie