Deduplicator is correct. The undefined behaviour that allows compilers to implement "strict aliasing" optimizations doesn't apply when character values are being used to produce a representation of an object. The standard's wording on trap representations (C99 6.2.6.1, paragraph 5) is:
Certain object representations need not represent a value of the object type. If the
stored value of an object has such a representation and is read by an lvalue expression
that does not have character type, the behavior is undefined. If such a representation
is produced by a side effect that modifies all or any part of the object by an lvalue
expression that does not have character type, the behavior is undefined. Such a
representation is called a trap representation.
However, your second example has undefined behaviour because foo is uninitialized. If you initialize foo, then it has only implementation-defined behaviour: it depends on the implementation-defined alignment requirements of long long and on whether long long has any implementation-defined padding bits.
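Both implementation-defined properties mentioned above can be inspected from within a program. The following is a minimal sketch, assuming a C11 compiler (for _Alignof); the helper names llong_padding_bits and llong_alignment are made up for illustration and are not part of any library.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: number of padding bits in long long.
 * Counts the value bits in LLONG_MAX, adds one for the sign bit,
 * and subtracts that from the total number of bits in the object. */
int llong_padding_bits(void) {
    unsigned long long max = LLONG_MAX;
    int value_bits = 0;

    while (max != 0) {
        value_bits++;
        max >>= 1;
    }
    return (int) (sizeof(long long) * CHAR_BIT) - value_bits - 1;
}

/* Hypothetical helper: alignment requirement of long long (C11). */
size_t llong_alignment(void) {
    return _Alignof(long long);
}
```

On the mainstream implementations I'm aware of, the first function should report no padding bits, but the standard doesn't require that.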
Consider if you change your second example to this:
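#include <stdlib.h>

long long bar() {
    /* malloc returns storage suitably aligned for any object type */
    char *foo = malloc(sizeof(long long));
    char c;
    /* build the object representation one byte at a time */
    for(c = 0; c < sizeof(long long); c++)
        foo[c] = c;
    long long *p = (long long *) foo;
    return *p;
}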
Now alignment is no longer an issue, and this example depends only on the implementation-defined representation of long long. What value is returned depends on that representation, but if it is defined as having no padding bits, then this function must always return the same value, and that value must always be valid. Without padding bits this function can't generate a trap representation, and so the compiler cannot perform any strict-aliasing type of optimization on it.
You have to look pretty hard to find a standard-conforming implementation of C that has implementation-defined padding bits in any of its integer types. I doubt you'll find one that implements any sort of strict-aliasing type of optimization. In other words, compilers don't use the undefined behaviour caused by accessing a trap representation to allow strict-aliasing optimizations, because no compiler that implements strict-aliasing optimizations has defined any trap representations.
Note also that had foo been initialized with all zeros ('\0' characters), this function wouldn't have any undefined or implementation-defined behaviour. An all-bits-zero representation of an integer type is guaranteed not to be a trap representation and guaranteed to have the value 0.
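As a rough sketch of the all-zeros case described above (the function name read_all_zero and the variable names are hypothetical, and the assert is only a stand-in for real error handling):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: fills malloc'd storage with '\0' characters
 * and then reads it back as a long long. */
long long read_all_zero(void) {
    /* malloc returns storage suitably aligned for any object type */
    char *bytes = malloc(sizeof(long long));
    long long value;

    assert(bytes != NULL);  /* sketch only: real code should handle failure */
    memset(bytes, 0, sizeof(long long));

    /* all-bits-zero is a valid representation of the value 0 */
    value = *(long long *) bytes;
    free(bytes);
    return value;
}
```

Whatever the size or byte order of long long, this should return 0.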
Now for a strictly conforming example that uses char values to create a guaranteed valid (possibly non-zero) representation of a long long value:
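#include <stdio.h>
#include <stdlib.h>

int
main(int argc, char **argv) {
    size_t i;
    long long l;
    char *buf;

    if (argc < 2) {
        return 1;
    }
    /* malloc returns storage suitably aligned for long long */
    buf = malloc(sizeof l);
    if (buf == NULL) {
        return 1;
    }
    l = strtoll(argv[1], NULL, 10);
    /* copy the representation of l byte by byte through char lvalues */
    for (i = 0; i < sizeof l; i++) {
        buf[i] = ((char *) &l)[i];
    }
    /* buf now holds a valid long long representation */
    printf("%lld\n", *(long long *)buf);
    return 0;
}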
This example has no undefined behaviour and is not dependent on the alignment or representation of long long. This is the sort of code that the character type exception on accessing objects was created for. In particular, this means that Standard C lets you implement your own memcpy function in portable C code.
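For illustration, here is a minimal sketch of such a hand-written copy routine. The name copy_bytes is hypothetical (chosen to avoid clashing with the library memcpy), and like memcpy it assumes the two regions don't overlap.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical memcpy-like function: copies n bytes from src to dst
 * using only character-type lvalues, which may access any object. */
void *copy_bytes(void *dst, const void *src, size_t n) {
    unsigned char *d = dst;
    const unsigned char *s = src;
    size_t i;

    assert(n == 0 || (dst != NULL && src != NULL));
    for (i = 0; i < n; i++) {
        d[i] = s[i];
    }
    return dst;
}
```

A call such as copy_bytes(&b, &a, sizeof b), with a and b both long long, should leave b holding the same value as a, much as the loop in the example above does for buf.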
... _Alignas(long long) to the char array, otherwise mis-alignment might cause UB. – Reviere
... struct whatever (depending on the initial sequence) work then? – Reviere
... char buf[...]; fread(buf, ...); foo(((MyStruct *)buf)->member); , it doesn't work. – Arezzini
... char ... whereas char has a special exception carved out for it. – Waaf
... long long *p = (long long *)&foo[0]; *p example certainly has a potential for alignment problems. e.g. foo is on an odd address and p may need an even (or quad) address. But is this the "strict aliasing rule" issue? Thought that had to do with #99150 – Lordship