Type punning and aliasing are distinct but related concepts that some compiler writers seem unable to distinguish despite their being largely orthogonal.
Type punning refers to situations in which storage is written as one type and read as another, typically so that a value can be interpreted as a sequence of bits, a sequence of bits can be interpreted as a value, or a value can be used as another type whose representation matches, at least in the portion of interest. For example, the last form of type punning may be useful when one has pointers to a variety of structure types, all of which share a Common Initial Sequence, and needs to operate on the common-initial-sequence members of all of those structures despite the structures' different types. Note that even though the Standard includes explicit guarantees which suggest that this form of type punning is supposed to be useful, compilers that confuse it with aliasing don't support such constructs.
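As a rough illustration of the first two forms (value to bits, and bits to value), here is a minimal sketch that uses memcpy, which is one way to pun without involving aliasing at all. The function names float_to_bits and bits_to_float are hypothetical, and the bit patterns mentioned in the comments assume that float is a 32-bit IEEE 754 binary32 value, which the Standard does not require.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: float and uint32_t have the same size on this platform. */
_Static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* Hypothetical helper: interpret a float value as a sequence of bits. */
uint32_t float_to_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);   /* copies the object representation */
    return bits;
}

/* Hypothetical helper: interpret a sequence of bits as a float value. */
float bits_to_float(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

On an IEEE 754 platform, float_to_bits(1.0f) would be expected to yield 0x3F800000, and bits_to_float(0x40000000) would be expected to yield 2.0f. Every access here uses an lvalue of the object's own type, so there is punning but nothing for type-based aliasing rules to object to.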
Aliasing refers to a different concept in which storage is accessed using two or more simultaneously-active but seemingly-unrelated means, in ways that interact with each other. Given something like:
int test1(int *p1, int *p2)
{
    *p1 = 1;
    *p2 = 2;
    return *p1;
}
if p1==p2, then p1 and p2 will alias since p1 will be used to access the storage identified by p2 sometime between the creation and last use of p2, in a context wherein p1 cannot have been created from p2 [it's possible that p1 might have been created from p2 before the function was called, but there's no way p1 could have been derived from p2 within the function]. Because the Standard allows aliasing between lvalues that identify the same type, however, the above construct would have defined behavior when p1==p2, despite the fact that p1 and p2 alias.
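To make the defined-behavior case concrete, here is a minimal sketch that calls the function above once with both arguments identifying the same int and once with distinct objects. The wrapper names call_with_alias and call_without_alias are hypothetical, introduced only for illustration.

```c
/* Same function as above: both parameters are int*, so the Standard
   allows them to identify the same object. */
int test1(int *p1, int *p2)
{
    *p1 = 1;
    *p2 = 2;
    return *p1;
}

/* Hypothetical helper: passes the same address twice, so p1 and p2 alias. */
int call_with_alias(void)
{
    int x = 0;
    return test1(&x, &x);
}

/* Hypothetical helper: passes two distinct objects, so there is no alias. */
int call_without_alias(void)
{
    int x = 0, y = 0;
    return test1(&x, &y);
}
```

Because both lvalues have type int, a conforming compiler has to account for the store through p2 before reading *p1 again, so the aliased call yields 2 while the non-aliased call yields 1.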
On the other hand, given something like:
struct s1 { int x; };
struct s2 { int x; };
union s1s2 { struct s1 v1; struct s2 v2; } uarr[100];
int test2(int i, int j)
{
    int temp;
    { struct s1 *p1 = &uarr[i].v1; temp = p1->x; }
    if (temp)
        { struct s2 *p2 = &uarr[j].v2; p2->x = 1; }
    { struct s1 *p3 = &uarr[i].v1; temp = p3->x; }
    return temp;
}
Here, the pointers p1, p2, and p3 have obviously-disjoint lifetimes and consequently are not simultaneously active and do not alias each other. Each pointer is independently derived from uarr, and the lifetime of each pointer will end prior to the next use of uarr. Consequently, this code makes use of type punning to access the same storage as both a struct s1 and a struct s2, but as written does not exploit aliasing since all the accesses to the storage in question are visibly derived from the same root-level object uarr.
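For comparison, here is a sketch of the same logic with every access written directly as a member access on the union object, with no intermediate pointers. The function name test_direct is hypothetical. This is not a claim about what the Standard requires of the pointer-based version above; it only shows the form that, for example, GCC's documentation for -fstrict-aliasing describes as allowed (punning where the memory is accessed through the union type).

```c
struct s1 { int x; };
struct s2 { int x; };
union s1s2 { struct s1 v1; struct s2 v2; } uarr[100];

/* Hypothetical variant: every access names uarr and the union member
   directly, so the derivation from the union is visible at each access. */
int test_direct(int i, int j)
{
    int temp = uarr[i].v1.x;      /* read as struct s1 */
    if (temp)
        uarr[j].v2.x = 1;         /* write as struct s2 */
    return uarr[i].v1.x;          /* re-read as struct s1 */
}
```

When i equals j and the stored x is nonzero, the final read is expected to observe the 1 written through v2, since both structures place x at the same offset.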
Unfortunately, even though type-based access rules were intended (according to both the Rationale and a footnote) to indicate when things are allowed to alias, some compilers interpret them in ways that make language features such as the Common Initial Sequence guarantee essentially useless, since they use the type-access rules as an excuse to rewrite the code in such a way as to remove the derivation of p3 from uarr, thus introducing aliasing where there had been none.