The GCC developers put some effort into making the compiler behave "as expected" in these cases. (I wish I could give you a proper reference for this - I remember it coming up on a mailing list or somesuch at some time).
At any rate, something you say:
... does not implement the function as it could using strict aliasing
... implies perhaps a slight misunderstanding of what the strict aliasing rules are for. Your code sample invokes undefined behavior - so any compilation is technically valid, including just a plain ret
or generation of a trap instruction, or even nothing at all (it's legitimate to assume the method can never be called). That newer versions of GCC produce longer/slower code is hardly a deficiency, since producing code that does any particular thing at all would not violate the standard. In fact, the newer versions improve the situation by producing code that does what the programmer probably intended the code to do instead of silently doing something differerent.
What would you rather - that the compiler produces fast code that doesn't do what you want, or slightly slower code that does do what you want?
That being said, I firmly believe that you should not write code that breaks the strict aliasing rules. Relying on the compiler doing the "right" thing when it is "obvious" what is intended is walking a tightrope. Optimisation is hard enough already, without the compiler having to guess at - and make allowances for - what the programmer intended. Further, it's possible to write code which obeys the rules and which can be turned into very efficient object code by the compiler. Indeed the further question can be raised:
Why did earlier versions of GCC behave the way they did, and "optimize" the function by relying on adherence to the strict aliasing rules?
That's a little bit complicated, but is interesting for this discussion (especially in light of suggestions that the compiler is going to some lengths just to break code). Strict aliasing is a component of (or rather, a rule which assists) a process called alias analysis. This process decides whether two pointers alias or not. There are, essentially, 3 possible conditions between any two pointers:
- They MUST NOT ALIAS (the strict aliasing rule makes it easy to deduce this condition, though it can sometimes be deduced in other ways).
- They MUST ALIAS (this requires analysis; value propagation might detect this condition for instance)
- They MAY ALIAS. This is the default condition when neither of the other two conditions can be established.
In the case of the code in your question, strict aliasing implies a MUST NOT ALIAS condition between &acopy
and ptr
(it is trivial to make this determination, because the two values have incompatible types which are not allowed to alias). This condition allows for the optimisation that you then see: all the manipulation of *ptr
values can be discarded because they cannot in theory effect the value of acopy
and they do not otherwise escape the function (which can be determined via escape analysis).
It takes further effort to determine the MUST ALIAS condition between the two pointers. Furthermore, in doing so the compiler would need to ignore (at least temporarily) the previously ascertained MUST NOT ALIAS condition, which means it must spend time attempting to ascertain the truth of a condition which, if everything is as it should be, must be false.
When both MUST NOT ALIAS and MUST ALIAS conditions are determined, we have a case where the code must be invoking undefined behaviour (and we can issue a warning). We then have to decide which condition to keep and which to discard. Because MUST NOT ALIAS, in this case, comes from a constraint which can be (and indeed has been) broken by the user, it is the best option to discard.
So, the older versions of GCC either do not do the requisite analysis to determine the MUST ALIAS condition (perhaps because the opposite MUST NOT ALIAS condition has already been established), or alternatively, the older GCC version opts to discard the MUST ALIAS condition in preference to the MUST NOT ALIAS condition, which leads to faster code which does not do what the programmer most likely intended. In either case, it seems that the newer versions offer an improvement.
main
from the example – Textualrol edi, 16
) ... gcc doesn't seem to know about that – Textualif ( !p->x || !p )
being removed – Textualif
intended to do? If!p
, it'll crash anyway if not optimized. Or is null page accessible in the kernel, so that this code relies on this? – Dovetailif ( !p || !p->x )
and the coder didn't realize there was a difference – Textual