When the C Standard was written, C implementations had at least three different ways of processing something like:

    extern int a[5];
    int x = a[i];

in cases where i was outside the range 0..4:

1. Some implementations used an abstraction model that would add i to the address of a, in a manner completely agnostic to whether the resulting address would be within a, and perform a load from that address with whatever consequences would occur. (Footnote 1)
2. Some implementations would attempt to trap on accesses outside the range 0..4.
3. Some implementations would generally behave as in #1, except that if the access to a[i] was preceded and followed within the same function by accesses to b[0], and nothing in between particularly suggested an access to the storage of b, a compiler might consolidate the two accesses to b[0].
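The situation in approach #3 can be sketched as follows. The arrays a and b and the index i come from the example above; the initial values and the function name sum_around are hypothetical, chosen only for illustration. With i in 0..4 all three approaches agree; they can differ only when i is out of range, which is the case the Standard leaves undefined.

```c
#include <stddef.h>

int a[5] = {10, 11, 12, 13, 14};   /* illustrative contents (assumed) */
int b[5] = {20, 21, 22, 23, 24};

/* Hypothetical helper: reads b[0], then a[i], then b[0] again.
 * Nothing between the two reads of b[0] is a defined access to b,
 * so a compiler using approach #3 may reuse the first load instead
 * of performing the second one.
 * The caller must keep i within 0..4; otherwise behavior is undefined. */
int sum_around(size_t i)
{
    int first  = b[0];
    int x      = a[i];
    int second = b[0];   /* may be folded into 'first' by the optimizer */
    return first + x + second;
}
```

Calling sum_around with an in-range index gives the same result whether or not the second load is consolidated, which is why the consolidation is invisible to correct programs.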
Each of these approaches had advantages for some tasks and disadvantages for others. Rather than imply that all compilers should use the same approach, the Standard allowed implementations to select among them, or among any other approaches that might be useful, including ones not yet invented, however they saw fit, on the presumption that implementations would select whatever approach would be most useful to the programmers targeting them. The Standard did this by categorizing as Undefined Behavior the situations where #1 and #3 would be observably different. (Footnote 2)
Compilers like gcc and clang are designed for tasks that are best served by identifying and eliminating code that would only be relevant in situations where the Standard imposes no requirements. Such tasks would not benefit from other treatments, such as specifying that an out-of-bounds array read may, at the compiler's leisure, either read the storage with whatever consequence results or yield an arbitrary value without side effects. That elimination is what you are seeing in this example.
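As a rough sketch of how this plays out, consider a lookup in an array of pointers. The names g_ptrArray, idx, and ARRAY_LENGTH are taken from the comment thread; the array size, the pointee, and both function names are assumptions made for illustration, since the question's code is not reproduced here. The first function tests the index before using it. The second tests it only after indexing, so a compiler is entitled to assume the test always passes and remove it.

```c
#include <stddef.h>

#define ARRAY_LENGTH 8                 /* assumed size, illustration only */

static int g_value = 42;               /* hypothetical pointee */
int *g_ptrArray[ARRAY_LENGTH] = { &g_value };   /* other slots are NULL */

/* Bounds test first: g_ptrArray[idx] is evaluated only when idx is
 * in range. Returns NULL for out-of-range indices and empty slots. */
int *lookup_checked(size_t idx)
{
    if (idx < ARRAY_LENGTH && g_ptrArray[idx] != NULL)
        return g_ptrArray[idx];
    return NULL;
}

/* Bounds test after indexing: by the time idx is compared, the array
 * has already been read. An out-of-range idx is undefined behavior,
 * so an optimizer may drop the comparison entirely.
 * Only call this with idx < ARRAY_LENGTH. */
int *lookup_late_check(size_t idx)
{
    if (g_ptrArray[idx] != NULL && idx < ARRAY_LENGTH)
        return g_ptrArray[idx];
    return NULL;
}
```

For in-range indices the two functions behave identically; only lookup_checked remains meaningful when idx may be out of range.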
Footnote 1: While C may not provide any means of forcing any particular object to be assigned storage immediately following a, other languages do offer control over layout, and if a and some other object extern int b[5]; were forced to be stored consecutively, an access to a[5] would be an access to b[0].
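Standard C does offer one portable way to get the adjacency described in this footnote: carve both regions out of a single larger array. The sketch below is only an illustration; storage and adjacent_demo are hypothetical names, and a and b here are pointers into one array rather than two separately declared arrays. Because a + 5 still points into storage, a[5] is a defined access that names the same element as b[0].

```c
/* Hypothetical backing store holding both five-element regions. */
static int storage[10];

/* Returns 1 if a write through b[0] is visible through a[5]. */
int adjacent_demo(void)
{
    int *a = storage;        /* elements 0..4 of storage */
    int *b = storage + 5;    /* elements 5..9 of storage */

    b[0] = 123;
    return a[5] == 123 && &a[5] == &b[0];
}
```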
Footnote 2: Some approaches such as #3 would be incompatible with a classification of the behavior as "Implementation Defined". Because the "as-if" rule requires that no observable aspects of program behavior be affected by optimizations except in scenarios categorized as Undefined Behavior, and because the behavior of out-of-bounds accesses on implementations using approach #3 would be affected by optimization, it was necessary for the Standard to categorize such accesses as Undefined Behavior.
This was in no way intended to imply that implementations which allowed objects' addresses to be assigned in ways that made approach #1 useful shouldn't continue to support that approach, nor that use of array-access syntax in such fashion wasn't a good way for implementations that allowed precise control over memory layout to allow programmers to exploit such control.
The Standard's definition of strictly conforming C programs explicitly specifies that such programs must not perform any actions characterized as invoking Undefined Behavior, but its definition of "conforming C programs" is devoid of such a requirement. It also expressly states that the phraseology "X shall be Y" means nothing more nor less than that execution of a construct when X isn't Y will invoke Undefined Behavior (implying that such execution would be forbidden in strictly conforming C programs, but not in conforming C programs). Even so, some people treat the Standard as not recognizing any category of conformance to which many constraints would not apply.
g_ptrArray[ idx ] to check for NULL, and the compiler can see the array declaration so it knows its size. It would be UB for idx to be past the end of the array, so the compiler can assume it doesn't happen. – Hangbird
idx after indexing with it; when it's out-of-bounds it is already too late. – Vanmeter
nullptr. Please state the exact compiler and compiler options used. – Profess
nullptr, which compilers already support (if you enable -std=c2x in new-enough GCC: godbolt.org/z/zxYK79TqP). Also, C and C++ agree with each other closely enough on UB to allow compilers to assume that UB hasn't already happened, because there's literally no requirement on behaviour in that situation. – Hangbird
nullptr. – Profess
cmp rdx, 664? Reading optimized disassembly is bad for my sanity... – Profess
cmp rax, 668 after unrolling by 3. I tried a couple different options but didn't see a 664. Was it perhaps comparing before an increment using lea (defeating cmp/jcc fusion), or starting from a non-zero index? – Hangbird
-Os code-gen. It "rotated" the loop and partially peeled the first iteration (loading g_ptrArray[0] ahead of the loop with mov rsi, [rax], and testing it for null). This is part of "loop inversion", re-arranging a while loop so there's a conditional branch at the bottom: Why are loops always compiled into "do...while" style (tail jump)?. So cmp rdx, 664 / ja done is exiting the loop when i = 665 or higher, since it's about to load from g_arrayPtr[i+1]. (Clang doesn't optimize away the idx check) – Hangbird
sum variable the compiler's alias analysis knows can't be pointed-to by an array member. godbolt.org/z/fbM5Yq9xs) – Hangbird
cmp/ja is only taken in the case where idx < ARRAY_LENGTH is false, which can never happen in a correct (UB-free) program. Reading one past the end of the array if non-null is already a bug. (Although usually silent as long as the array doesn't land at the end of a page.) – Hangbird
nullptr. This will eliminate the need to check for bounds. – Bureaucratize