I am trying to better understand a rather surprising discovery regarding unions and the common initial sequence rule. The common initial sequence rule says (class.mem 23):
In a standard-layout union with an active member of struct type T1, it is permitted to read a non-static data member m of another union member of struct type T2 provided m is part of the common initial sequence of T1 and T2; the behavior is as if the corresponding member of T1 were nominated.
So, given:
    struct A {
        int a;
        double x;
    };
    struct B {
        int b;
    };
    union U {
        A a;
        B b;
    };

    int main() {
        U u;
        u.a = A{};       // a is now the active member
        int i = u.b.b;   // read through the inactive member b
    }
This is defined behavior and i should have the value 0 (because A and B have a CIS of their first member, an int). So far, so good. The confusing thing is what happens if B is replaced by simply an int:
    union U {
        A a;
        int b;
    };
    ...
    int i = u.b;
According to the definition of common initial sequence:
The common initial sequence of two standard-layout struct types is...
So CISs can only apply between two standard-layout structs. And in turn:
A standard-layout struct is a standard-layout class defined with the class-key struct or the class-key class.
So a primitive type very definitely does not qualify; that is, it cannot have a CIS with anything, so A has no CIS with an int. Therefore the standard says that the first example is defined behavior, but the second is UB. This simply does not make any sense to me at all; intuitively, the compiler is at least as restricted with a primitive type as with a class. If this is intentional, is there any rhyme or reason (perhaps alignment-related) as to why this makes sense? Is it possibly a defect?
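The two facts that seem to be in tension can at least be checked at compile time. The following is a minimal sketch, assuming C++11 or later; the constant names (a_is_standard_layout, int_is_struct_type, a_starts_with_its_int) are hypothetical, chosen here only for illustration. It confirms that int is not a class type, so the CIS wording cannot apply to it, and that A's int member sits at offset 0. It says nothing about whether reading u.b is defined.

```cpp
#include <cstddef>      // offsetof
#include <type_traits>  // std::is_class, std::is_standard_layout

struct A {
    int a;
    double x;
};

// A is a standard-layout struct, so it can take part in a common initial sequence.
constexpr bool a_is_standard_layout = std::is_standard_layout<A>::value;

// int is not a class type at all, so it is not a "struct type" in the CIS wording.
constexpr bool int_is_struct_type = std::is_class<int>::value;

// The first member of a standard-layout struct is located at offset 0.
constexpr bool a_starts_with_its_int = (offsetof(A, a) == 0);

static_assert(a_is_standard_layout, "A should be a standard-layout struct");
static_assert(!int_is_struct_type, "int is not a class type");
static_assert(a_starts_with_its_int, "A::a should be at the start of A");
```

If these assertions hold, the layout intuition is right: A really does begin with an int. The question is only whether the wording of the rule, which is phrased in terms of struct types, covers that case.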
struct A { int a; ... } and int should begin with the same memory layout. – Celanese