Given the code:
struct s1 { unsigned short x; };
struct s2 { unsigned short x; };
union s1s2 { struct s1 v1; struct s2 v2; };
static int read_s1x(struct s1 *p) { return p->x; }
static void write_s2x(struct s2 *p, int v) { p->x = v; }
int test(union s1s2 *p1, union s1s2 *p2, union s1s2 *p3)
{
    if (read_s1x(&p1->v1))
    {
        unsigned short temp;
        temp = p3->v1.x;
        p3->v2.x = temp;
        write_s2x(&p2->v2, 1234);
        temp = p3->v2.x;
        p3->v1.x = temp;
    }
    return read_s1x(&p1->v1);
}
int test2(int x)
{
    union s1s2 q[2];
    q->v1.x = 4321;
    return test(q, q + x, q + x);
}
#include <stdio.h>
int main(void)
{
    printf("%d\n", test2(0));
}
There exists one union object in the entire program: q. Its active member is set to v1, and then to v2, and then to v1 again. Code only uses the address-of operator on q.v1, or the resulting pointer, when that member is active, and likewise q.v2. Since p1, p2, and p3 are all the same type, it should be perfectly legal to use p3->v1 to access p1->v1, and p3->v2 to access p2->v2.
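For comparison, the same sequence of active-member changes can be written so that every access goes through the union lvalue itself, with no pointers to members handed to helper functions. This is only a sketch; the name direct_union_access is made up for illustration and is not part of the program above.

```c
struct s1 { unsigned short x; };
struct s2 { unsigned short x; };
union s1s2 { struct s1 v1; struct s2 v2; };

/* Hypothetical helper: the same steps as test(), but every access
   is spelled through the union object q directly. */
static int direct_union_access(void)
{
    union s1s2 q;
    unsigned short temp;

    q.v1.x = 4321;   /* active member: v1 */
    temp = q.v1.x;
    q.v2.x = temp;   /* active member: v2 */
    q.v2.x = 1234;   /* the store that write_s2x() performs */
    temp = q.v2.x;
    q.v1.x = temp;   /* active member: v1 again */
    return q.v1.x;
}
```

Written this way the function returns 1234, which is the result I would expect from the pointer-based version as well.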
I don't see anything that would justify a compiler failing to output 1234, but many compilers, including clang and gcc, generate code that outputs 4321. I think what's going on is that they decide that because the operations on p3 won't actually change the contents of any bits in memory, they can just be ignored altogether, but I don't see anything in the Standard that would justify ignoring the fact that p3 is used to copy data from p1->v1 to p2->v2 and vice versa.
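To make that suspicion concrete, here is a hand-written sketch of what the generated code appears to amount to. It is my guess at the transformation, not actual compiler output, and test_as_if_optimized and test2_as_if_optimized are hypothetical names: the copies through p3 are dropped as no-ops, and the value read through p1->v1 is then reused on the assumption that a store to a struct s2 cannot alias a struct s1.

```c
struct s1 { unsigned short x; };
struct s2 { unsigned short x; };
union s1s2 { struct s1 v1; struct s2 v2; };

/* Hypothetical: what test() seems to be reduced to. */
static int test_as_if_optimized(union s1s2 *p1, union s1s2 *p2)
{
    int cached = p1->v1.x;   /* read_s1x(&p1->v1), loaded only once */
    if (cached)
    {
        /* both round trips through p3 have been removed as no-ops */
        p2->v2.x = 1234;     /* write_s2x(&p2->v2, 1234) */
    }
    return cached;           /* earlier value reused instead of a reload */
}

/* Hypothetical counterpart of test2(); p3 is no longer needed. */
static int test2_as_if_optimized(int x)
{
    union s1s2 q[2];
    q->v1.x = 4321;
    return test_as_if_optimized(q, q + x);
}
```

Calling test2_as_if_optimized(0) returns 4321 even though the storage holds 1234 by the time the function returns, which matches what gcc and clang print for the original program.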
Is there anything in the Standard that would justify such behavior, or are compilers simply not following it?
... unsigned x instead of unsigned short x, do you see the same problem? – Onomatology
... unsigned char and then writing them back (which compilers don't support either) and it was more convenient to do that with two bytes than four. The problem is that the compiler completely optimizes out the operations on p3 and loses the aliasing-related information provided thereby. – Pincince
... unsigned would fail in a likewise manner as unsigned short. With unsigned, we can set aside any of the usual promotion issues, which shouldn't affect this. – Onomatology
... unsigned short could promote as either int or unsigned, coercion of values 32767u and below to int is fully defined by the Standard on all implementations. – Pincince
... -fsanitize=undefined and see what UBSan alerts to. You have to run your program with its test data because UBSan is a runtime checker. It does not produce false positives. – Mulkey