I have been reading about the strict aliasing rule for a while, and I'm starting to get really confused. First of all, I have read these questions and some answers:
- strict-aliasing-rule-and-char-pointers
- when-is-char-safe-for-strict-pointer-aliasing
- is-the-strict-aliasing-rule-really-a-two-way-street
According to them (as far as I understand), accessing a char buffer using a pointer to another type violates the strict aliasing rule. However, the glibc implementation of strlen() has such code (with comments and the 64-bit implementation removed):
    size_t strlen(const char *str)
    {
        const char *char_ptr;
        const unsigned long int *longword_ptr;
        unsigned long int longword, magic_bits, himagic, lomagic;

        for (char_ptr = str; ((unsigned long int) char_ptr
                              & (sizeof (longword) - 1)) != 0; ++char_ptr)
            if (*char_ptr == '\0')
                return char_ptr - str;

        longword_ptr = (unsigned long int *) char_ptr;
        himagic = 0x80808080L;
        lomagic = 0x01010101L;

        for (;;)
        {
            longword = *longword_ptr++;
            if (((longword - lomagic) & himagic) != 0)
            {
                const char *cp = (const char *) (longword_ptr - 1);
                if (cp[0] == 0)
                    return cp - str;
                if (cp[1] == 0)
                    return cp - str + 1;
                if (cp[2] == 0)
                    return cp - str + 2;
                if (cp[3] == 0)
                    return cp - str + 3;
            }
        }
    }
The longword_ptr = (unsigned long int *) char_ptr; line obviously aliases an unsigned long int to char. I fail to understand what makes this possible. I see that the code takes care of alignment problems, so there are no issues there, but I think alignment is unrelated to the strict aliasing rule.
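For comparison, here is a minimal sketch of the same word-at-a-time idea that loads each word with memcpy instead of dereferencing a cast pointer, so the aliasing question does not arise for the load itself. The names load_word, may_have_zero_byte, and strlen_sketch are made up for this illustration and are not glibc's. The sketch fixes the word at 4 bytes, and it still reads up to 3 bytes past the terminator, which ISO C does not guarantee to be valid either, so it is not a drop-in replacement.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not glibc): copy 4 bytes out of the buffer.
   memcpy is defined whatever the effective type of the source is. */
static uint32_t load_word(const char *p)
{
    uint32_t w;
    memcpy(&w, p, sizeof w);
    return w;
}

/* Same filter as the quoted code: the result is nonzero whenever some
   byte of w is zero. It can also fire for some bytes with the high bit
   set (0x81 and above), so callers must recheck byte by byte. */
static int may_have_zero_byte(uint32_t w)
{
    return ((w - 0x01010101u) & 0x80808080u) != 0;
}

size_t strlen_sketch(const char *str)
{
    const char *p = str;

    /* Go byte by byte until p is 4-byte aligned, so that each later
       4-byte read covers exactly one aligned word. */
    while (((uintptr_t)p & 3u) != 0) {
        if (*p == '\0')
            return (size_t)(p - str);
        ++p;
    }

    for (;;) {
        if (may_have_zero_byte(load_word(p))) {
            /* Confirm the hit; a false positive falls through. */
            for (size_t i = 0; i < 4; ++i)
                if (p[i] == '\0')
                    return (size_t)(p - str) + i;
        }
        p += 4;
    }
}
```

Compilers commonly turn a fixed-size memcpy like this into a single load, but that is an optimization detail rather than a guarantee.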
The accepted answer for the third linked question says:
However, there is a very common compiler extension allowing you to cast properly aligned pointers from char to other types and access them, however this is non-standard.
The only thing that comes to mind is the -fno-strict-aliasing option; is that the case? I could not find any documentation of what the glibc implementors rely on, and the comments imply that the cast is made without any concern, as if it were obvious that there will be no problems. That makes me think that it is indeed obvious and I am missing something silly, but my search has turned up nothing.
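To show what a compiler extension of this kind can look like, here is a minimal sketch using the may_alias type attribute supported by GCC and Clang. It is only an example of a non-standard mechanism and makes no claim about what the quoted glibc code actually depends on. The names word_alias_t and first_word are hypothetical, and the caller is assumed to pass a suitably aligned buffer of at least sizeof(unsigned long int) bytes.

```c
#include <stddef.h>

/* GCC/Clang extension, not ISO C: accesses through this type are
   treated like character-type accesses for alias analysis.
   The typedef name is made up for this sketch. */
typedef unsigned long int __attribute__((__may_alias__)) word_alias_t;

/* Hypothetical helper: read the first word of a char buffer.
   Assumes aligned_buf is aligned for unsigned long int and holds
   at least sizeof(unsigned long int) bytes. */
unsigned long int first_word(const char *aligned_buf)
{
    const word_alias_t *wp = (const word_alias_t *)aligned_buf;
    return *wp;
}
```

Unlike -fno-strict-aliasing, which applies to a whole translation unit, the attribute only affects accesses made through the annotated type.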
(unsigned long int) char_ptr is also fishy. And they go through all this trouble to attempt some weird optimization which adds extra branches and doesn't necessarily look faster, and is possibly slower. – Floater
sizeof(unsigned long int) == sizeof(void *), and typedef unsigned long int uintptr_t; in <stdint.h> also appears. Here's the full implementation with nothing stripped for those interested. Similar code casting a pointer to unsigned long int can be found in the implementation of memcpy. On my machine using GCC (-std=gnu11|c11), the resulting behavior is the same, -fstrict-aliasing or not. No diagnostics appear during compilation. – Psychotechnics