To be able to take a pointer to A, and reinterpret it as a pointer to B, they must be pointer-interconvertible.
Pointer-interconvertible is about objects, not types of objects.
In C++, there are objects at places. If you have a Big at a particular spot with at least one member existing, there is also a Hdr at that same spot due to pointer-interconvertibility.
However, there is no Little object at that spot. If there is no Little object there, the Big cannot be pointer-interconvertible with a Little object that isn't there.
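As a minimal sketch of the "a Hdr lives at the same spot as the Big" claim, the following checks the addresses directly. The definitions of Hdr, A, B and Big are hypothetical stand-ins for the question's types (the real ones are not shown here), and hdr_lives_at_big is a name invented for illustration.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-ins for the question's types.
struct Hdr { int type; };
struct A { Hdr h; int x; };
struct B { Hdr h; double y; };
union Big { A a; B b; };

static_assert(std::is_standard_layout<A>::value, "A must be standard-layout");
static_assert(std::is_standard_layout<Big>::value, "Big must be standard-layout");

// Returns true when a pointer to the union can be reinterpreted as a pointer
// to the Hdr at the start of its active member.
bool hdr_lives_at_big() {
    Big big = { A{ {1}, 42 } };                 // a is the active member
    Hdr* h = reinterpret_cast<Hdr*>(&big);      // union -> member a -> first member h
    return h == &big.a.h && h->type == 1;
}
```

The cast works because a union object is pointer-interconvertible with its members, and a standard-layout struct with its first member. No such chain leads from Big to a Little, which is the point of the paragraph above.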
They appear to be layout-compatible, assuming they are flat data (plain old data, trivially copyable, etc).
This means you can copy their byte representation and it works. In fact, optimizers seem to understand that a memcpy to a stack local buffer, a placement new (with trivial constructor), then a memcpy back is actually a noop.
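#include <cstring>     // std::memcpy
#include <new>         // placement new
#include <type_traits> // std::is_pod

template<class T>
T* laundry_pod( void* data ) {
  static_assert( std::is_pod<T>{}, "POD only" ); // could be relaxed a bit
  char buff[sizeof(T)];
  std::memcpy( buff, data, sizeof(T) );  // save the bytes currently at data
  T* r = ::new( data ) T;                // begin the lifetime of a T at data
  std::memcpy( data, buff, sizeof(T) );  // restore the saved bytes into the new T
  return r;
}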
The above function is a noop at runtime (in an optimized build), yet it converts T-layout-compatible data at data to an actual T.
So, if I am right and Big and Little are layout-compatible when Big is a subtype of the types in Little, you can do this:
Little* inplace_to_little( Big* big ) {
  return laundry_pod<Little>(big);
}
Big* inplace_to_big( Little* little ) {
  return laundry_pod<Big>(little);
}
or
void given_big(Big& big) { // cannot be const
  switch(big.h.type) {
    case B::type: // fallthrough
    case C::type: {
      // braces scope little so later case labels do not jump past its initialization
      auto* little = inplace_to_little(&big); // replace Big object with Little inplace
      given_b_or_c(*little);
      inplace_to_big(little); // revive Big object. Old references are valid, barring const data or inheritance
      break;
    }
    // ... other cases here ...
  }
}
If Big has non-flat data (like references or const data), the above breaks horribly.
Note that laundry_pod doesn't do any memory allocation; it uses placement new that constructs a T in the place where data points, using the bytes at data. And while it looks like it is doing lots of stuff (copying memory around), it optimizes to a noop.
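To make the round trip concrete, here is a self-contained sketch that converts a Big to a Little in place, writes through the Little, and converts back. The type definitions and the round_trip function are hypothetical, since the question's real types are not shown here, and the POD check is spelled with individual traits rather than std::is_pod. Whether the standard fully sanctions this pattern is this answer's claim, not something the sketch proves; it only shows that the bytes survive each conversion.

```cpp
#include <cassert>
#include <cstring>
#include <new>
#include <type_traits>

template<class T>
T* laundry_pod(void* data) {
    static_assert(std::is_trivially_copyable<T>::value &&
                  std::is_trivially_default_constructible<T>::value &&
                  std::is_standard_layout<T>::value,
                  "flat data only");
    unsigned char buff[sizeof(T)];
    std::memcpy(buff, data, sizeof(T));   // save the bytes
    T* r = ::new (data) T;                // begin the lifetime of a T in place
    std::memcpy(data, buff, sizeof(T));   // put the bytes back
    return r;
}

// Hypothetical stand-ins for the question's types.
struct Hdr { int type; };
struct A { Hdr h; int a_data[4]; };
struct B { Hdr h; int b_data; };
struct C { Hdr h; int c_data; };
union Big    { Hdr h; A a; B b; C c; };
union Little { Hdr h; B b; C c; };

// Writes through a Little view of a Big's storage, then reads it back as a Big.
// Returns the payload if the header survived, or -1 otherwise.
int round_trip() {
    Big big = {};
    big.b = B{ {2}, 7 };                         // b becomes the active member
    Little* little = laundry_pod<Little>(&big);  // storage now holds a Little
    little->b.b_data = 4;
    Big* again = laundry_pod<Big>(little);       // storage holds a Big again
    return again->h.type == 2 ? again->b.b_data : -1;
}
```

Note that the storage always belongs to a Big variable, so copying sizeof(Big) bytes on the way back stays in bounds. Going the other way, from storage that was only ever sizeof(Little) bytes, would read past the end.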
C++ has a concept of "an object exists". The existence of an object has almost nothing to do with what bits or bytes are written in the physical or abstract machine. There is no instruction in your binary that corresponds to "now an object exists".
But the language has this concept.
Objects that don't exist cannot be interacted with. If you do so, the C++ standard does not define the behavior of your program.
This permits the optimizer to make assumptions about what your code does and what it doesn't do and which branches cannot be reached and which can be reached. It lets the compiler make no-aliasing assumptions; modifying data through a pointer or reference to A cannot change data reached through a pointer or reference to B unless somehow both A and B exist in the same spot.
The compiler can prove that Big and Little objects cannot both exist in the same spot. So no modification of any data through a pointer or reference to Little could modify anything existing in a variable of type Big. And vice versa.
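The same no-aliasing assumption is easiest to see with two unrelated scalar types. The sketch below uses hypothetical names (store_both, call_with_distinct_objects) and is only an illustration of the rule, not code from the question.

```cpp
#include <cassert>

// Writes through both pointers, then reads the first one back.
// Because int and float are unrelated types, the compiler may assume that
// *f does not alias *i and return the constant 1 without reloading *i.
int store_both(int* i, float* f) {
    *i = 1;
    *f = 0.0f;
    return *i;
}

// Well-defined use: the two objects really are distinct.
int call_with_distinct_objects() {
    int i = 0;
    float f = 1.0f;
    return store_both(&i, &f);
}
```

Passing a float* that was produced by casting the address of the int would break the assumption the optimizer relies on, which is the same situation as writing to a Big through a Little reference.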
Imagine if given_b_or_c modifies a field. Well, the compiler could inline given_big and given_b_or_c and use_a_b, notice that no instance of Big is modified (just an instance of Little), and prove that fields of data from Big it cached prior to calling your code could not be modified.
This saves it a load instruction, and the optimizer is quite happy. But now you have code that reads:
Big b = whatever;
b.foo = 7;
((Little&)b).foo = 4;
if (b.foo!=4) exit(-1);
that is optimized to
Big b = whatever;
b.foo = 7;
((Little&)b).foo = 4;
exit(-1);
because it can prove that b.foo must be 7: it was set once and never modified. The access through Little could not modify the Big due to aliasing rules.
Now do this:
Big b = whatever;
b.foo = 7;
(*laundry_pod<Little>(&b)).foo = 4;
Big& b2 = *laundry_pod<Big>(&b);
if (b2.foo!=4) exit(-1);
and it cannot assume that the Big there was unchanged, because there is a memcpy and a ::new that could legally change the state of the data. No strict aliasing violation.
It can still follow the memcpy and eliminate it.
Live example of laundry_pod being optimized away. Note that if it weren't optimized away, the code would have to have a conditional and a printf. But because it was, it was optimized into the empty program.
big.b.h.type (or big.h.type) even though the active member is big.c. I don't think it is legal to reinterpret cast to Little. – Montevideo
union Big2 { Hdr h; A a; B b; C c; D d; E e; F f; }; would it be legal to reinterpret_cast from Big to Big2? My gut feel is "no" (but I can't prove it). – Montevideo
B and C had a common initial sub-sequence, then possibly... – Decury
Big has a larger size than Little (because some extra members are large), the layout of Big may already be different (say a and b may be placed not at the beginning of Big's address), so converting to Little via reinterpret_cast would already be incorrect. However, I do not believe that a member can start anywhere other than at the beginning of the union. – Wessling
A, B etc. all share a Hdr as their first member, then why not simply use polymorphism? Alternatively, why not use a layout where the types A, B etc. don't contain a Hdr, which is kept outside of the union: struct Big { Hdr h; union {A a; B b; /* etc */ }; }; – Odelia
big.h.type which is already UB by the formal documentation :) – Wessling
big.h.type is legal. The "common initial subsequence rule" that the OP referred to forces it. – Montevideo
big.h.type you already implicitly assume and use that all members start at the same address – Wessling
big.h.type is legal (if all other union members begin exactly with Hdr), but from the formal reference I do not see why this is legal if this is an "inactive member" – Wessling
Big have the same address (same as the address of h). Only in this case can we read Hdr via different members with the same result. And the address of a member is equal to the address of the union itself (the union is only as big as necessary to hold its largest data member). So the address of h is the same as that of Big and Little, so we can reinterpret cast the pointer – Wessling