I've been writing embedded C code for many years now, and the newer generations of compilers and optimizations have certainly gotten a lot better with respect to their ability to warn about questionable code.
However, there is at least one use case (very common, in my experience) that continues to cause grief, where a common base type is shared between multiple structs. Consider this contrived example:
#include <stdio.h>
struct Base
{
unsigned short t; /* identifies the actual structure type */
};
struct Derived1
{
struct Base b; /* identified by t=1 */
int i;
};
struct Derived2
{
struct Base b; /* identified by t=2 */
double d;
};
struct Derived1 s1 = { .b = { .t = 1 }, .i = 42 };
struct Derived2 s2 = { .b = { .t = 2 }, .d = 42.0 };
void print_val(struct Base *bp)
{
switch(bp->t)
{
case 1:
{
struct Derived1 *dp = (struct Derived1 *)bp;
printf("Derived1 value=%d\n", dp->i);
break;
}
case 2:
{
struct Derived2 *dp = (struct Derived2 *)bp;
printf("Derived2 value=%.1lf\n", dp->d);
break;
}
}
}
int main(int argc, char *argv[])
{
struct Base *bp1, *bp2;
bp1 = (struct Base*) &s1;
bp2 = (struct Base*) &s2;
print_val(bp1);
print_val(bp2);
return 0;
}
Per ISO/IEC 9899, the casts in the code above should be OK, as they rely on the first member of a structure sharing the same address as the containing structure. Clause 6.7.2.1-13 says so:
Within a structure object, the non-bit-field members and the units in which bit-fields
reside have addresses that increase in the order in which they are declared. A pointer to a
structure object, suitably converted, points to its initial member (or if that member is a
bit-field, then to the unit in which it resides), and vice versa. There may be unnamed
padding within a structure object, but not at its beginning.
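As a minimal sketch of what this clause guarantees, the check below compares the address of a structure with the address of its first member and confirms that the member sits at offset zero. The struct definitions are repeated from the example so the block stands alone; the helper name first_member_matches is hypothetical and introduced only for illustration.

```c
#include <stddef.h>

struct Base { unsigned short t; };
struct Derived1 { struct Base b; int i; };

/* Hypothetical helper: returns 1 when the structure and its first member
 * share an address and the member is at offset zero, 0 otherwise. */
int first_member_matches(const struct Derived1 *dp)
{
    return (const void *)dp == (const void *)&dp->b
        && offsetof(struct Derived1, b) == 0;
}
```

Note that the derived-to-base direction does not need a cast at all: &s1.b already has type struct Base * and refers to the same address as &s1.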
The casts from derived to base work fine, but the cast back to the derived type within print_val() generates an alignment warning. However, this is known to be safe, as it is specifically the "vice versa" part of the clause above. The problem is that the compiler simply doesn't know that we've already guaranteed, via other means, that the structure is in fact an instance of the other type.
When compiled with gcc version 9.3.0 (Ubuntu 20.04) using flags -std=c99 -pedantic -fstrict-aliasing -Wstrict-aliasing -Wcast-align=strict -O3 I get:
alignment-1.c: In function ‘print_val’:
alignment-1.c:30:31: warning: cast increases required alignment of target type [-Wcast-align]
30 | struct Derived1 *dp = (struct Derived1 *)bp;
| ^
alignment-1.c:36:31: warning: cast increases required alignment of target type [-Wcast-align]
36 | struct Derived2 *dp = (struct Derived2 *)bp;
| ^
A similar warning occurs in clang 10.
Rework 1: pointer to pointer
A method used in some circumstances to avoid the alignment warning (when the pointer is known to be aligned, as is the case here) is to use an intermediate pointer-to-pointer. For instance:
struct Derived1 *dp = *((struct Derived1 **)&bp);
However this just trades the alignment warning for a strict aliasing warning, at least on gcc:
alignment-1a.c: In function ‘print_val’:
alignment-1a.c:30:33: warning: dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
30 | struct Derived1 *dp = *((struct Derived1 **)&bp);
| ~^~~~~~~~~~~~~~~~~~~~~~~~
The same is true if the cast is applied on the lvalue side; that is, *((struct Base **)&dp) = bp; also warns in gcc.
Notably, only gcc complains about this one - clang 10 seems to accept this either way without warning, but I'm not sure if that's intentional or not.
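To make the gcc diagnostic concrete: the expression *((struct Derived1 **)&bp) reads the pointer variable bp itself, declared as struct Base *, through an lvalue of type struct Derived1 *. The access being flagged is to the pointer object, not to the structure it points at. The sketch below is not something from the original code; it shows one way to reinterpret the pointer object without such an lvalue access, by copying its bytes with memcpy. It assumes the rule that all pointers to structure types share the same representation, and the helper name repr_copy_to_derived1 is hypothetical.

```c
#include <string.h>

struct Base { unsigned short t; };
struct Derived1 { struct Base b; int i; };

/* Hypothetical helper: copies the bytes of the pointer object bp into dp,
 * instead of reading bp through an lvalue of a different pointer type. */
struct Derived1 *repr_copy_to_derived1(struct Base *bp)
{
    struct Derived1 *dp;

    /* Assumes struct pointers share one representation and size. */
    memcpy(&dp, &bp, sizeof dp);
    return dp;
}
```

This is only meant to illustrate where the aliasing concern lies; it is more verbose than a plain conversion and still offers no check that bp really refers to a struct Derived1.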
Rework 2: union of structures
Another way to rework this code is using a union. So the print_val() function can be rewritten something like:
void print_val(struct Base *bp)
{
union Ptr
{
struct Base b;
struct Derived1 d1;
struct Derived2 d2;
} *u;
u = (union Ptr *)bp;
...
The various structures can be accessed using the union. While this works fine, the cast to a union is still flagged as violating alignment rules, just like the original example.
alignment-2.c:33:9: warning: cast from 'struct Base *' to 'union Ptr *' increases required alignment from 2 to 8 [-Wcast-align]
u = (union Ptr *)bp;
^~~~~~~~~~~~~~~
1 warning generated.
Rework 3: union of pointers
Rewriting the function as follows compiles cleanly in both gcc and clang:
void print_val(struct Base *bp)
{
union Ptr
{
struct Base *bp;
struct Derived1 *d1p;
struct Derived2 *d2p;
} u;
u.bp = bp;
switch(u.bp->t)
{
case 1:
{
printf("Derived1 value=%d\n", u.d1p->i);
break;
}
case 2:
{
printf("Derived2 value=%.1lf\n", u.d2p->d);
break;
}
}
}
There seems to be conflicting information out there as to whether this is truly valid. In particular, an older aliasing write-up at https://cellperformance.beyond3d.com/articles/2006/06/understanding-strict-aliasing.html specifically calls out a similar construct as being invalid (see Casting through a union (3) in that link).
In my understanding, because the pointer members of the union all share a common base type, this doesn't actually violate any aliasing rules, because all accesses to struct Base will in fact be done via an object of type struct Base, whether by dereferencing the bp union member or by accessing the b member object of d1p or d2p. Either way, the member is accessed correctly via an object of type struct Base, so as far as I can tell, there is no alias.
Specific Questions:
- Is the union-of-pointers suggested in rework 3 a portable, safe, standards compliant, acceptable method of doing this?
- If not, is there a method that is fully portable and standards compliant, and does not rely on any platform-defined/compiler-specific behavior or options?
It seems to me that since this pattern is fairly common in C code (in the absence of true OO constructs like in C++) that it should be more straightforward to do this in a portable way without getting warnings in one form or another.
Thanks in advance!
Update:
Using an intermediate void* may be the "right" way to do this:
struct Derived1 *dp = (void*)bp;
This certainly works, but it really allows any conversion at all, regardless of type compatibility. (I suppose the weaker type system of C is fundamentally to blame for this; what I really want is an approximation of C++ and the static_cast<> operator.)
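One way to limit the "anything goes" nature of void* is to confine the conversion to one small helper per derived type, so that call sites never spell out a cast. The sketch below repeats the struct definitions from the example so it stands alone. The helper names as_derived1 and as_derived2 are hypothetical, and returning NULL on a tag mismatch is a design choice made purely for illustration. It does not bring back compile-time checking of the downcast itself; it only narrows where the unchecked conversion can happen.

```c
#include <stddef.h>

struct Base { unsigned short t; };
struct Derived1 { struct Base b; int i; };   /* identified by t=1 */
struct Derived2 { struct Base b; double d; }; /* identified by t=2 */

/* Hypothetical helper: yields bp as a Derived1 only when the tag matches. */
struct Derived1 *as_derived1(struct Base *bp)
{
    if (bp == NULL || bp->t != 1)
        return NULL;
    return (void *)bp; /* void * converts implicitly to the return type */
}

/* Hypothetical helper: yields bp as a Derived2 only when the tag matches. */
struct Derived2 *as_derived2(struct Base *bp)
{
    if (bp == NULL || bp->t != 2)
        return NULL;
    return (void *)bp; /* void * converts implicitly to the return type */
}
```

Because each helper takes a struct Base * parameter, passing a pointer to an unrelated type requires a diagnostic from the compiler, which recovers a little of the checking that a bare (void*) cast at the call site gives up.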
However, my fundamental question (misunderstanding?) about strict aliasing rules remains:
Why does using a union type and/or pointer-to-pointer violate strict aliasing rules? In other words, what is fundamentally different between what is done in main (effectively taking the address of the b member) and what is done in print_val(), other than the direction of the conversion? Both yield the same situation: two pointers of different struct types, a struct Base* and a struct Derived1*, that point to the same memory.
It would seem to me that if this were violating strict aliasing rules in any way, the introduction of an intermediate void* cast would not change the fundamental problem.
void* option in my OP. The only reason I feel this is less-than-ideal is because it is really "anything goes" -- I'd like to get a warning if casting between types that are truly incompatible. But a void* basically means anything goes. – Knap