In C, what exactly are the performance benefits that come with observing strict aliasing?
There is a page that describes aliasing very thoroughly here.
There are also some SO topics here and here.
To summarize: when strict aliasing is not being enforced, the compiler must assume that two pointers of different types may be accessing the same location, so it has to re-read the value on every access and cannot optimize those reads away.
Strict aliasing options:
- gcc: -fstrict-aliasing [default] and -fno-strict-aliasing
- msvc: Strict aliasing is off by default. (If somebody knows how to turn it on, please say so.)
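As a minimal sketch of what this means in practice (the function name store_then_read is hypothetical and not taken from the linked pages): with strict aliasing enabled, a compiler may assume that an int * and a float * never refer to the same object, so it is allowed to return the constant 1 below without re-reading *i. With -fno-strict-aliasing it would typically have to reload *i after the store through f.

```c
/* Hypothetical illustration; the names are not from the original post. */
int store_then_read(int *i, float *f)
{
    *i = 1;
    *f = 2.0f;   /* under strict aliasing, assumed not to modify *i */
    return *i;   /* the optimizer may fold this into "return 1" */
}
```

When called with two distinct objects the behavior is well defined and the result is 1 either way; the flag only changes whether the compiler needs to emit the reload.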
Example
Copy-paste this code into main.c:
void f(unsigned u)
{
    unsigned short* const bad = (unsigned short*)&u;
}
int main(void)
{
    f(5);
    return 0;
}
Then compile the code with these options:
gcc main.c -Wall -O2
And you will get:
main.c:3: warning: dereferencing type-punned pointer will break strict-aliasing rules
Disable strict aliasing with:
gcc main.c -fno-strict-aliasing -Wall -O2
And the warning goes away. (Or just take out -Wall but...don't compile without it)
Try as I might I could not get MSVC to give me a warning.
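If the intent behind a cast like the one in f is to look at the bytes of u as an unsigned short, one commonly suggested alternative that does not run into the aliasing rules is to copy the bytes with memcpy. A minimal sketch, using the hypothetical name first_half; which half of u it returns depends on the platform's byte order:

```c
#include <string.h>

/* Hypothetical helper: copies the first sizeof(unsigned short) bytes of u. */
unsigned short first_half(unsigned u)
{
    unsigned short s;
    memcpy(&s, &u, sizeof s);   /* byte copy, no type-punned pointer involved */
    return s;
}
```

Compilers commonly turn a small fixed-size memcpy like this into a plain load, so this form need not cost anything at -O2.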
restrict? Your first linked article suggests "If a program requires -fno-strict-alias one should figure that the programmer probably didn't bother to use restrict", but if restrict can offer all the performance advantages of type-based aliasing without the semantic limitations, why shouldn't one disable type-based aliasing, especially given that there's no way of knowing how gcc's interpretation of the rules may change in the future? – Humfrey
f(5) dereferences the function-pointer-valued expression f). – Lassalle
The level of performance improvement that will result from applying type-based aliasing will depend upon:
- The extent to which code caches things in automatic-duration objects or, via the restrict qualifier, indicates that compilers may do so without regard for whether the cached values might be affected by certain pointer-based operations.
- Whether the aliasing assumptions made by a compiler are consistent with what a programmer needs to do (if they're not, reliable processing would require disabling type-based aliasing, negating any benefits it could otherwise have offered).
Consider the following two code snippets:
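#include <stdint.h>

struct descriptor { uint32_t size; uint16_t *dat; };

/* Reads ptr->size and ptr->dat through ptr on every iteration. */
void test1(struct descriptor *ptr)
{
    for (uint32_t i = 0; i < ptr->size; i++)
        ptr->dat[i] = 1234;
}

/* Caches both members in automatic-duration objects before the loop. */
void test2(struct descriptor *ptr)
{
    uint32_t size = ptr->size;
    uint16_t *dat = ptr->dat;
    for (uint32_t i = 0; i < size; i++)
        dat[i] = 1234;
}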
In the absence of type-based aliasing rules, a compiler given test1() would have to allow for the possibility that ptr->dat might point to an address within ptr->size or ptr->dat. This would in turn require that it either check whether ptr->dat was in range to access those things, or else reload the contents of ptr->size and ptr->dat on every iteration of the loop. In this scenario, type-based aliasing rules might allow for a 10x speedup.
On the other hand, a compiler given test2() could generate code equivalent to the optimized version of test1() without having to care about type-based aliasing rules. In this case, performing the same operation, type-based aliasing rules would not offer any speedup.
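The restrict route mentioned earlier can be sketched in the same way. In this hypothetical test3 (not part of the original answer), qualifying the local dat with restrict is a promise that the elements written through dat are not accessed through any other pointer during the block, so a compiler may keep ptr->size in a register even when type-based aliasing is disabled:

```c
#include <stdint.h>

struct descriptor { uint32_t size; uint16_t *dat; };

/* Hypothetical variant of the loop above: restrict tells the compiler that
   stores through dat cannot modify ptr->size or ptr->dat. */
void test3(struct descriptor *ptr)
{
    uint16_t *restrict dat = ptr->dat;
    for (uint32_t i = 0; i < ptr->size; i++)
        dat[i] = 1234;
}
```

The promise is the programmer's responsibility: if dat really did overlap *ptr, the behavior would be undefined.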
Now consider the following functions:
/* IS_BIG_ENDIAN is assumed to be defined elsewhere as 0 or 1. */
uint32_t *ptr;

void set_bottom_16_bits_and_advance_v1(uint16_t value)
{
    ((uint16_t*)ptr)[IS_BIG_ENDIAN] = value;   /* store through a uint16_t lvalue */
    ptr++;
}

void set_bottom_16_bits_and_advance_v2(uint16_t value)
{
    ((unsigned char*)ptr)[3*IS_BIG_ENDIAN] = value & 255;        /* low byte */
    ((unsigned char*)ptr)[(3*IS_BIG_ENDIAN) ^ 1] = value >> 8;   /* high byte */
    ptr++;
}

void test1(unsigned n)
{
    for (unsigned i = 0; i < n; i++)
        set_bottom_16_bits_and_advance_v1(i);
}

void test2(unsigned n, int value)
{
    for (unsigned i = 0; i < n; i++)
        set_bottom_16_bits_and_advance_v2(value);
}
If a compiler given set_bottom_16_bits_and_advance_v1 and test1 were, even with type-based aliasing enabled, to accommodate the possibility that the function might modify an object of type uint32_t (since its execution makes use of a value of type uint32_t*), it would not need to allow for the possibility that ptr might hold its own address. If a compiler could not handle the possibility of the first function accessing a uint32_t without disabling type-based aliasing entirely, however, it would need to reload ptr on every iteration of the loop. Almost any compiler(*), with or without type-based aliasing analysis, which is given set_bottom_16_bits_and_advance_v2 and test2, however, would be required to reload ptr every time through the loop, because stores through a character-type pointer may modify any object, including ptr itself. That reduces to zero any performance benefits type-based aliasing could have offered.
(*) The CompCert C dialect expressly disallows the use of character pointers, or any other pointer-to-integer type, to modify the values of stored pointer objects, since making allowance for such accesses would not only degrade performance, but also make it essentially impossible to identify all corner cases that would need to be evaluated to guarantee that the behavior of a compiler's generated machine code will match the specified behavior of the source.
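To see why character-typed stores force the reload in standard C: an unsigned char * may be used to access the bytes of any object, including a pointer object. A minimal sketch (the helper name overwrite_pointer_bytes is hypothetical) in which byte-wise stores change where a pointer points:

```c
#include <stddef.h>

/* Hypothetical helper: overwrites the pointer object *dst one byte at a time. */
void overwrite_pointer_bytes(int **dst, int *new_value)
{
    unsigned char *d = (unsigned char *)dst;
    const unsigned char *s = (const unsigned char *)&new_value;
    for (size_t k = 0; k < sizeof new_value; k++)
        d[k] = s[k];   /* character-type stores may alias any object */
}
```

After overwrite_pointer_bytes(&p, &b), p compares equal to &b. A compiler that sees stores through an unsigned char * of unknown origin therefore generally cannot keep a global pointer such as ptr cached across them.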