It is clear to me that the C standard forbids this program
Not really. The standard doesn't define what happens if you type pun from a character array into a struct: it is undefined behavior, since it violates a "shall" in C17 6.5/7, but it is not a constraint violation, so no diagnostic is required.
Regarding all the "strict aliasing sucks am I right?" rants... yes and no. The original purpose of these rules was to disallow wild and crazy conversions. The C99 rationale 5.10 chapter 6.5/35 shows this example:
int a;
void f(int * b)
{
    a = 1;
    *b = 2;
    g(a);
}
It is tempting to generate the call to `g` as if the source expression were `g(1)`, but `b` might point to `a`, so this optimization is not safe. On the other hand, consider
int a;
void f( double * b )
{
    a = 1;
    *b = 2.0;
    g(a);
}
Again the optimization is incorrect only if `b` points to `a`. However, this would only have come about if the address of `a` were somewhere cast to `double*`. The C89 Committee has decided that such dubious possibilities need not be allowed for.
This is the original rationale. C99 extended the unclear rules of C89 a bit with the introduction of effective type, for better and worse. The rules are still very much unclear, but the original intention was to spare compilers from having to account for weird possibilities like the one above. So far, that is a perfectly sensible assumption for compilers to be allowed to make.
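To make the first rationale example concrete, here is a minimal sketch in which `b` really does alias `a`. The recording variable `last_seen`, the body of `g`, and the two helper functions are assumptions added for illustration; the rationale only shows `f`.

```c
int a;
static int last_seen;   /* hypothetical: records what g() received */

static void g(int value)
{
    last_seen = value;
}

void f(int *b)
{
    a = 1;
    *b = 2;   /* may or may not modify a */
    g(a);     /* a must be reloaded, since an int* may legally point to a */
}

/* Returns the value g() saw when b aliases a. */
int call_with_alias(void)
{
    f(&a);
    return last_seen;
}

/* Returns the value g() saw when b points at an unrelated int. */
int call_without_alias(void)
{
    int other = 0;
    f(&other);
    return last_seen;
}
```

With the alias in place `g` receives 2, otherwise 1, which is why the compiler cannot blindly substitute `g(1)` in the `int*` version.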
Unfortunately, somewhere in the early 2000s, some compilers, most notably gcc, decided to abuse this in order to perform optimizations. Suddenly you couldn't do things like `uint8_t arr[2]; ... *(uint16_t*)arr` because that is, strictly speaking, a strict aliasing violation. Until C99, compilers had generated sensible code without such optimizations, but past C99 some chose to go haywire. The situation has improved somewhat over the years, but we still cannot rely on compilers to generate "the expected" code for my little `uint16_t*` conversion above.
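For the `uint8_t arr[2]` case, one commonly used alternative that does not depend on how a compiler treats the pointer cast is to copy the bytes with `memcpy`, or to assemble the value with shifts. This is only a sketch; the function names `load_u16_native` and `load_u16_le` are made up for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Reads two bytes as a uint16_t in the host's byte order.
   memcpy copies the object representation, so no lvalue of type
   uint16_t is ever used to access the uint8_t array. */
uint16_t load_u16_native(const uint8_t *src)
{
    uint16_t value;
    memcpy(&value, src, sizeof value);
    return value;
}

/* Reads two bytes as little-endian, independent of the host's byte order. */
uint16_t load_u16_le(const uint8_t *src)
{
    return (uint16_t)(src[0] | ((unsigned)src[1] << 8));
}
```

The shift version is usually preferable when the byte order is dictated by a protocol or a hardware register layout rather than by the CPU.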
The number of exceptions to the strict aliasing rules in C17 6.5/7 leaves a lot to be desired. For example it is perfectly sensible to type pun between various unsigned integer types - anyone who's done hardware-related programming understands this. But this isn't allowed.
And as another example, there's no mention of what happens with type qualifiers. Nobody in the whole world seems to be able to answer this: "What rules are there for qualifiers of effective type?" I have no idea what the rules are myself.
It's unclear how to use arrays in relation to effective type... the list goes on. There are numerous Defect Reports about various details of these rules, but the rules haven't been improved.
As for whether your program contains any strict aliasing violations and how to fix them:
- `unsigned char buffer[SIZE];` has the effective type (array of) `unsigned char`.
- `const uintptr_t start = (uintptr_t)(buffer+free_slot);` is fine assuming that you don't end up with misalignment, but that's a separate issue.
- When you de-reference the pointer from the caller side and make an lvalue access as `int` or a struct type etc, there is a strict aliasing violation, since this is not one of the allowed exceptions in the list 6.5/7. The other way around - going from a larger type and accessing byte by byte with character type pointers would be fine.
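The permitted direction, inspecting a larger object byte by byte through a character type pointer, can be sketched as follows. The function name `sum_bytes_of_int` is hypothetical and only there to give the loop something observable to return.

```c
#include <stddef.h>

/* Sums the bytes of an int's object representation.
   Accessing any object through an unsigned char lvalue is one of the
   exceptions listed in 6.5/7, so this is not a strict aliasing violation. */
unsigned sum_bytes_of_int(const int *value)
{
    const unsigned char *bytes = (const unsigned char *)value;
    unsigned sum = 0;
    for (size_t i = 0; i < sizeof *value; i++) {
        sum += bytes[i];
    }
    return sum;
}
```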
So to fix it you have to do something like this, for the `int` example:
typedef union
{
    int i;
    unsigned char bytes[sizeof(int)];
} intalias_t;
Now you can do:
intalias_t* p1 = alloc(sizeof(int),alignof(int));
(*p1).i = 143; // well-defined
Because `(*p1).i` is "an lvalue expression that" has "an aggregate or union type that includes" "a type compatible with the effective type of the object". That is, the union contains a character type array which is (supposedly) compatible with the effective type, which is also a character type. "Supposedly" since the rules are muddy when it comes to array access. And if your original array or the one in the union contained a type qualifier, nobody knows(?) what will happen.
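Put together, the union approach might look like the sketch below. The body of `alloc` is a guess at a simple bump allocator over `buffer` and `free_slot` (the capacity of 256 and the `NULL`-on-exhaustion behavior are assumptions), and `store_and_load` is a hypothetical helper. Whether the access truly satisfies 6.5/7 rests on the reading of the rule given above.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define SIZE 256  /* assumed capacity, for illustration only */

static unsigned char buffer[SIZE];
static size_t free_slot = 0;

typedef union
{
    int i;
    unsigned char bytes[sizeof(int)];
} intalias_t;

/* Hypothetical bump allocator: hands out aligned chunks of buffer,
   returns NULL when there is not enough room left. */
void *alloc(size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0); /* power of two */
    const uintptr_t start = (uintptr_t)(buffer + free_slot);
    const size_t padding = (size_t)((align - start % align) % align);
    if (padding > SIZE - free_slot || size > SIZE - free_slot - padding) {
        return NULL;
    }
    void *result = buffer + free_slot + padding;
    free_slot += padding + size;
    return result;
}

/* Stores a value through the union member and reads it back.
   Returns -1 if the allocation fails. */
int store_and_load(int value)
{
    intalias_t *p1 = alloc(sizeof(intalias_t), alignof(intalias_t));
    if (p1 == NULL) {
        return -1;
    }
    p1->i = value;
    return p1->i;
}
```

Allocating `sizeof(intalias_t)` rather than `sizeof(int)` keeps the request correct even if the union ever gains a larger member.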
When in doubt, or as a rule of thumb, use `-fno-strict-aliasing`.