GCC removes a bounds check in the right operand of &&, but not in the left operand, why?

I have the following C/C++ code snippet:

#define ARRAY_LENGTH 666

int g_sum = 0;
extern int *g_ptrArray[ ARRAY_LENGTH ];

void test()
{
    unsigned int idx = 0;

    // either enable or disable the check "idx < ARRAY_LENGTH" in the while loop
    while( g_ptrArray[ idx ] != nullptr /* && idx < ARRAY_LENGTH */ )
    {
        g_sum += *g_ptrArray[ idx ];
        ++idx;
    }

    return;
}

When I compile the above code with GCC 12.2.0 (targeting ARM) and the option -Os, for both cases:

  1. the while loop condition is g_ptrArray[ idx ] != nullptr
  2. the while loop condition is g_ptrArray[ idx ] != nullptr && idx < ARRAY_LENGTH

I get the following assembly:

test():
        ldr     r2, .L4
        ldr     r1, .L4+4
.L2:
        ldr     r3, [r2], #4
        cbnz    r3, .L3
        bx      lr
.L3:
        ldr     r3, [r3]
        ldr     r0, [r1]
        add     r3, r3, r0
        str     r3, [r1]
        b       .L2
.L4:
        .word   g_ptrArray
        .word   .LANCHOR0
g_sum:
        .space  4

As you can see, the assembly does NOT do any checking of the variable idx against the value ARRAY_LENGTH.


My question

How is that possible? How can the compiler generate exactly the same assembly for both cases and ignore the idx < ARRAY_LENGTH condition when it is present in the code? Can you explain the rule or procedure by which the compiler concludes that it can completely remove the condition?

The output assembly shown in Compiler Explorer (observe that both assemblies are identical):

  1. the while condition is g_ptrArray[ idx ] != nullptr:

  2. the while condition is g_ptrArray[ idx ] != nullptr && idx < ARRAY_LENGTH:

NOTE: If I swap the order of conditions to be idx < ARRAY_LENGTH && g_ptrArray[ idx ] != nullptr, the output assembly contains the check for value of idx as you can see here: https://godbolt.org/z/fvbsTfr9P.

Writein answered 27/10, 2023 at 11:6 Comment(19)
You just accessed g_ptrArray[ idx ] to check for NULL, and the compiler can see the array declaration so it knows its size. It would be UB for idx to be past the end of the array, so the compiler can assume it doesn't happen.Hangbird
See also - examples of UB and optimizationCallup
There is no point in validating idx after indexing with it; when it's out-of-bounds it is already too late.Vanmeter
You should be aware that "undefined behavior" means anything can happen, including but not limited to the program giving your expected output. But never rely on the output of a program that has UB; the program may just crash.Chindwin
What is "C/C++"? Are you compiling as C or as C++? C does not as of yet support nullptr. Please state the exact compiler and compiler options used.Profess
@Lundin: To be fair, they did link the builds on Godbolt with ARM GCC 12.2 in C++ mode. And the upcoming ISO C23 standard has nullptr, which compilers already support (if you enable -std=c2x in new-enough GCC: godbolt.org/z/zxYK79TqP). Also, C and C++ agree with each other closely enough on UB to allow compilers to assume that UB hasn't already happened, because there's literally no requirement on behaviour in that situation.Hangbird
I'm not really opposed to [c][c++] tags on this question, but agreed the question should say explicitly what they tested, not just via Godbolt links. Because without already knowing the answer, you can't know it's not important whether you compiled as C++ or C23.Hangbird
@PeterCordes C and C++ have different rules about linkage and about initialization of variables at file scope, so it isn't just about nullptr.Profess
@PeterCordes On a different note, could you explain why x86-64 clang does cmp rdx, 664? Reading optimized disassembly is bad for my sanity...Profess
@Lundin: In what build? In godbolt.org/z/3hrYr4co9 (x86-64 clang -O3), it's doing cmp rax, 668 after unrolling by 3. I tried a couple different options but didn't see a 664. Was it perhaps comparing before an increment using lea (defeating cmp/jcc fusion), or starting from a non-zero index?Hangbird
@PeterCordes godbolt.org/z/8GsGf7ha4 So some of the instructions before that comparison are just (seemingly random) unrolling?Profess
@Lundin: It's not unrolling in that -Os code-gen. It "rotated" the loop and partially peeled the first iteration (loading g_ptrArray[0] ahead of the loop with mov rsi, [rax], and testing it for null). This is part of "loop inversion", re-arranging a while loop so there's a conditional branch at the bottom: Why are loops always compiled into "do...while" style (tail jump)? So cmp rdx, 664 / ja done is exiting the loop when i = 665 or higher, since it's about to load from g_ptrArray[i+1]. (Clang doesn't optimize away the idx check)Hangbird
(I simplified the asm by using a local sum variable the compiler's alias analysis knows can't be pointed-to by an array member. godbolt.org/z/fbM5Yq9xs)Hangbird
@PeterCordes I see, that might make sense. Although the gcc "ISO compliant death station" optimization of just dropping one comparison from the code will of course be way faster for the branch prediction.Profess
@Lundin: Well, faster just from overall pipeline throughput from fewer instructions. And on CPUs that have trouble predicting more than 1 branch per 16-byte block or something like that, but even such CPUs, it's usually not a problem if the branch is never taken. Which is the case here, since cmp/ja is only taken in the case where idx < ARRAY_LENGTH is false, which can never happen in a correct (UB-free) program. Reading one past the end of the array if non-null is already a bug. (Although usually silent as long as the array doesn't land at the end of a page.)Hangbird
Another recent example question of people wondering about the compiler optimizing away a path that would be undefined behavior: Is (x<y)==(-x>-y) true or false?Schuller
real example github.com/openssl/openssl/issues/12143Mashburn
If you are concerned with tight performance, consider making the array one entry larger than you need it and ensure that the last element is always nullptr. This will eliminate the need to check for bounds.Bureaucratize
"GCC removes a bounds check in the right operand of &&, but not in the left operand, why?" - Because it can. The C++ rules around short-circuiting and UB allows the compiler to determine that the check is not necessary in the rightmost part of the expression, so it didn't do it. In general you want your compiler to be clever like that - it enables faster code. It's on you to know all the language rules and write correct code. Yes, C++ is difficult, That's just a fact, but what you get in return is super fast code - just know the rules and write religiously correct code at all times.Postfree

Accessing an array out of bounds is undefined behavior, so the compiler can assume that it never happens in the LHS of the && expression. It then jumps through hoops (optimizations) to notice that, since ARRAY_LENGTH is the length of the array, the RHS condition must necessarily hold true (otherwise UB would already have ensued in the LHS). Hence the result you see.

The correct check would be idx < ARRAY_LENGTH && g_ptrArray[idx] != nullptr. This would avoid any possibility of undefined behavior on the RHS since the LHS has to be evaluated first, and the RHS is not evaluated unless the LHS is true (in C and C++ the && operator is guaranteed to behave this way).
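
A minimal sketch of the reordered loop, reusing the declarations from the question (the function name test_fixed is made up):

void test_fixed()
{
    unsigned int idx = 0;

    // bounds check first: && short-circuits, so g_ptrArray is never
    // indexed with an out-of-range idx and the compiler must keep the check
    while( idx < ARRAY_LENGTH && g_ptrArray[ idx ] != nullptr )
    {
        g_sum += *g_ptrArray[ idx ];
        ++idx;
    }
}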

Even potential undefined behavior can do nasty things like that!

Katalin answered 27/10, 2023 at 11:13 Comment(23)
A blatant case of compiler perversion: instead of warning about an obvious programmer error, the compiler optimizes code away silently. I wish the compiler folks would add -Wdont-assume-the-programmer-is-always-rightDi
Even potential undefined behavior can do nasty things like that! - an easier way to express that is: "The compiler can assume UB doesn't happen in your program". (The C++ standard is clear that this is retroactive, that later code can imply value-ranges for earlier code if there isn't a branch that would avoid the UB. ISO C is not clear on that: see Does undefined behaviour retroactively mean that earlier visible side-effects aren't guaranteed? which compares the standards)Hangbird
Potential UB isn't an issue. Only UB that will certainly happen is an issue.Ifill
@Ifill the UB shown in the code does not "certainly happen" though. It can happen at runtime given the right conditions, but it won't "certainly" happen. However, it is still an issue.Katalin
In this case, it's UB that would have necessarily already happened for idx < ARRAY_LENGTH to be reachable with an idx that makes the comparison false. That's just a matter of terminology in how to describe things, I think; the concept seems relatively simple. It gets trickier if you had code like if (x < 0) *ptr = 1; arr[x] = 2; where arr is an actual array, not just a pointer, so the compiler can be sure that a negative index would be out of bounds and thus UB.Hangbird
The trickier question is whether the if() *ptr conditional assignment has to happen along a path of execution that will inevitably reach UB, even if *ptr is a volatile access. (See the question I linked in my first comment, and an ISO C committee proposal linked in comments there, which proposes to answer it differently from how ISO C++ allows compilers to treat it: open-std.org/jtc1/sc22/wg14/www/docs/n3128.pdf - concrete UB: visible side-effects have to happen if sequenced before UB is actually reached.)Hangbird
Sorry, this is mostly a nitpick; as Peter said, a matter of terminology. "can happen at runtime given the right conditions" - I mean that there's a certain moment at runtime when the UB will inevitably happen as the program continues; that's when the adverse effects kick in. As long as it remains merely "potential", the program doesn't break; it's impossible to notice that the condition was removed from the loop without causing UB.Ifill
@Di "Obvious" how exactly? Do you want compilers to warn about every possible elidable computation? If not, can you delineate precisely when it is an error and when it is intentional?Ology
@PasserBy: the incorrect order of the tests g_ptrArray[idx] != nullptr and idx < ARRAY_LENGTH should be obvious in a code review. Conversely, the fact that idx < ARRAY_LENGTH is redundant if g_ptrArray[idx] != nullptr is assumed to be legitimate is non obvious to human readers of the code. The compiler assumes that the code writer knows that g_ptrArray[idx] != nullptr can be performed, hence assumes that idx is in the proper range and deduces that the second test is redundant. This is perverse: if the programmer was shrewd enough to correctly assume idx is always ...Di
... in the proper range, they surely would not have written the redundant test. Conversely, if they made a mistake and wrote the test in the wrong order, flagging the redundant code as such would help fix a blatant bug. When compilers become smart enough to detect such redundancy, this level of analysis should profit the programmer, and help detect programming errors, not make debugging harder than it already is.Di
@Di Yes it's obvious in code review, but that's why we have them. To the compiler, there are redundancies everywhere, particularly when functions are inlined or when macro functions are used (e.g. bounds checks in different functions are redundant). By the time the optimizer starts working, it is operating on intermediate representations and has already lost most of what makes it obvious.Ology
@PasserBy: I am well aware of the difficulty to detect programmers' mistakes automatically, but as we all know, correctness is more important than performance. Compiler writers would be well advised to put more effort in detecting potential incorrect code and default to warning about them.Di
@Di you are just questioning the fundamental principles of C and C++. They are made for the hypothetical developer “who knows what they are doing” or “knows better than the compiler”. While I’d agree with you regarding the design philosophy, I don’t see a point in discussing it in the context of such a language while alternative languages exist. As Passer By said, the code might not have been written in this form literally. At the same time, compiler writers try to implement warnings for cases that look like a mistake. Would be interesting to check which compilers do already warn here.Rosenzweig
Note the correct check is only well defined because the left hand side of && is guaranteed to be evaluated first by C and C++'s "short circuiting" rules.Ethanethane
@chqrlie, perhaps you can help make that happen? I'm pretty sure that the GCC maintainers would welcome patches that help improve the diagnostics emitted, if they don't generate false positives and don't hurt other desirable attributes too much (e.g. compilation speed on correct code and quality of compiled output).Parch
@TobySpeight: point taken. I am not sure if I can tackle the task for gcc or clang: I'm allergic to the GNU indentation style and trying to stay away from C++ too, but I shall take a look and try.Di
@Holger: A more fundamental principle of the language the C Standard was chartered to describe is that if machine code for some platform could satisfy application requirements without performing some operation, neither the programmer nor the compiler should have to generate code that performs that operation. If on some platform reading any byte within 1024 bytes of the end of an object could be guaranteed to yield some value without side effect, and if most invocations of a loop would exit before reaching the final index value, code which inspects items before testing the index would...Qktp
...perform one fewer bounds check in cases where the loop exited early, and one extra value check in cases where the loop bound was reached, than code which tests the index first. The benefits of this approach may become even greater when unrolling a loop, if e.g. the loop would be required to run at least 251 times before exiting because of its boundary condition. Normally, if one unrolled a loop 5x, one would need to separately handle the last 0-4 iterations separately, but if padding out loop execution to the next multiple of five wouldn't matter, there'd be no need to special-case that.Qktp
@Qktp that sounds even more like a language solely designed for the hypothetical developer “who knows what they are doing”. Besides that, I don’t see how compiler vendors and application developers could ever agree “if machine code for some platform could satisfy application requirements” (regardless of whether with or without performing some operation) when only one side actually knows the application requirements.Rosenzweig
@Holger: Compilers seeking to observe the Principle of Least Astonishment should treat a construct like sum+=a[x] as meaning "either perform the address computation, load, and addition using the machine steps implied by the operation, or substitute some other sequence of steps that would also be consistent with application requirements". A compiler needn't know anything about application requirements to follow that recipe, but a compiler which is told (e.g. via compilation options) what kinds of low level behavioral details are and are not important may be able to follow it more efficiently.Qktp
@Holger: Older versions of the language specification simply indicated that code would behave as though the operations were treated as a sequence of sequential steps. Allowing flexibility to deviate from that sequence in cases where doing so would not interfere with the tasks to be accomplished made the language more useful in cases where compilers sought to avoid interfering with what programmers were doing.Qktp
@Holger: Or, to put things more simply, compilers are entitled to assume that the behavior resulting from sequentially processing all of the individual steps of a program in a stand-alone manner characteristic of the environment, without regard for earlier or later steps, would satisfy application requirements; they can thus satisfy application requirements without having to know or care what they are. C's reputation for speed in the 1980s came about largely because programmers could use it as a 'high level assembly language", and the C Standard Commitee's charter recognizes such usage.Qktp
An answer like this should have citations in the body of the post.Unsuitable

The C standard (C17 6.5.6 §8) says that we may not do pointer arithmetic outside an array nor access it beyond the array - doing so is undefined behavior and anything can happen.

Therefore, strictly speaking, the array out of bounds check is superfluous since your loop condition goes as "stop upon spotting a null pointer within the array". And in case g_ptrArray[ idx ] is an out of bounds access, you invoke undefined behavior so the program would in theory be toasted at this point. So there's no need to evaluate the right operand of &&. (As you probably know, && has strict left-to-right evaluation.) The compiler can assume that the access is always inside the array of known size.

We can smack the compiler back in line by adding something that makes it impossible for the compiler to predict the code:

int** I_might_change_at_any_time = g_ptrArray;
void test2()
{
    unsigned int idx = 0;

    // the idx < ARRAY_LENGTH check is now present in the generated assembly
    while( I_might_change_at_any_time[ idx ] != nullptr && idx < ARRAY_LENGTH)
    {
        g_sum += *g_ptrArray[ idx ];
        ++idx;
    }
}

Here the pointer acts as "middle man". It's a file scope variable with external linkage and so it may change at any time. The compiler can no longer assume that it always points at g_ptrArray. It will now be possible for the left operand of && to be a well-defined access. And therefore gcc now adds the out-of-bounds check to the assembly:

        cmp     QWORD PTR [rdx+rax*8], 0
        je      .L6
        cmp     eax, 666
        je      .L6
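
A related sketch, as a variation of my own rather than part of the answer above: qualifying the middle man with volatile forces every read of the pointer to actually be performed, so the compiler again cannot assume it still points at g_ptrArray (the names I_must_be_reloaded and test3 are made up):

int** volatile I_must_be_reloaded = g_ptrArray;

void test3()
{
    unsigned int idx = 0;

    // every iteration must reload the volatile pointer; its value is
    // unknown, so indexing it proves nothing about idx and the check stays
    while( I_must_be_reloaded[ idx ] != nullptr && idx < ARRAY_LENGTH )
    {
        g_sum += *I_must_be_reloaded[ idx ];
        ++idx;
    }
}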
Profess answered 27/10, 2023 at 11:48 Comment(18)
“The C standard (C17 6.5.6 §8) says that we may not do pointer arithmetic outside an array”. You know this, but there’s one exception: we can calculate a pointer one element past the end of an array, and compare it to a pointer within the array. That is, p < &array[0] + ARRAY_SIZE for p between &array[0] and endp is valid.Kelvin
@Davislor: The C Standard waives jurisdiction over how compiler writers process pointer arithmetic outside an array, so as to allow them to process it in whatever manner would best serve the needs of their customers, who were expected to be the programmers that would be writing code for use with the compilers. The only things that would be forbidden within a "conforming C program" would be constructs that would compel a conforming C implementation to reject it.Qktp
@Qktp That is not correct. The Standard specifically allows it: “If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.”Kelvin
"It's a file scope variable with external linkage and so it may change at any time". That doesn't follow, and in fact here it cannot change during the loop; doing so would be either be a data race or a violation of strict aliasing.Dotti
@Davislor: "If a ‘‘shall’’ or ‘‘shall not’’ requirement that appears outside of a constraint or runtime-constraint is violated, the behavior is undefined. Undefined behavior is otherwise indicated in this International Standard by the words ‘‘undefined behavior’’ or by the omission of any explicit definition of behavior. There is no difference in emphasis among these three; they all describe ‘‘behavior that is undefined’’.Qktp
@Davislor: If one fixes the last clause to make it non-recursive, it would say they all describe "...behavior upon which the Standard imposes no requirements". The Standard expressly provides that implementations may process constructs which invoke UB "...in a documented manner characteristic of the environment", and the Rationale makes clear an intention that many implementations extend the semantics of the language by doing precisely that.Qktp
@supercat: No one is arguing with you about what an implementation may do with UB. Davidslor is correctly pointing out that it's not the pointer arithmetic here that is UB, it's the lvalue-to-rvalue conversion (or C's equivalent). g_ptrArray + idx is fine even when idx == ARRAY_LENGTH, but *(g_ptrArray + idx) (assuming it is in an evaluated context and not the operand of &) is not.Dotti
@Qktp The clause immediately before the one you quoted (4 §1) explains why you’re wrong.Kelvin
@Davislor: The only programs to which 4.1 refers are strictly conforming programs, i.e. those which are 100% portable. When the Standard specifies that UB may occur as a result of non-portable or erroneous program constructs, that phrase includes constructs which are non-portable without being erroneous.Qktp
@Davislor: How does the Standard characterize constructs which the authors expected most implementations to process in a consistent and useful fashion, but which they recognized might be impractical for some implementations to process in a manner that would obey the "as if rule" requirements for defined program executions?Qktp
@Qktp Under your interpretation, it would also be undefined behavior to do pointer arithmetic within the array. If your reading were true of the second clause of the or statement (“point to elements of the same array object, or one past the last element of the array object”), it would also apply to the first. And no, there’s no clever technicality, error in the standard or ambiguity here. That’s not what it says.Kelvin
@Davislor: Do you have an answer for my last question? So far as I can tell, the as-if rule would require that any action which might sometimes cause an implementation to observably deviate from sequential program behavior be characterized as UB, even if 99% of implementations were expected to process it consistently 100% of the time, and the weird 1% were expected to deviate from specified behavior only in ways that would rarely be adverse to program requirements.Qktp
@Davislor: My reading of the Standard says that it waives jurisdiction over how implementations process out of bounds array accesses, allowing them to treat such accesses in whatever manner would best help their users accomplish what needs to be done. How is that inconsistent with anything else the Standard says?Qktp
@Davislor: And I'm still awaiting an answer to the question about how the Standard characterizes actions that most implementations would process identically, but which some implementations might be unable to process in a manner which was always consistent with the as-if rule's requirements for defined program executions.Qktp
@Qktp Your question is irrelevant and based on a misunderstanding. The Standard says, plain as day, that, under these conditions, pointer arithmetic “shall not” fail. If you re-read part 4, you will see that this is a constraint on what the implementation is and is not allowed to do under some condition. A sentence like this never creates undefined behavior. (In other contexts, “shall not” might mean something else, which is why there are multiple definitions for “shall not” in part 4.) I am not going to waste any more time on this.Kelvin
Slight pedantry: I_might_change_at_any_time can't change "at any time" in the C memory model. But it certainly could change between initialisation and the call to test2(), which is what's relevant here.Parch
@Qktp As I previously told you, I don’t want to spend any more time on this discussion.Kelvin
@TobySpeight Maybe we’re supposed to imagine it’s volatile?Kelvin

The explanation: as documented by Marco Bonelli, the compiler assumes that the programmer who wrote the first test g_ptrArray[idx] != nullptr knows that this code has defined behavior, hence it assumes that idx is in the proper range, i.e. idx < ARRAY_LENGTH. It follows that the next test && idx < ARRAY_LENGTH is redundant, as idx is already known to be < ARRAY_LENGTH, hence the code can be elided.

It is unclear where this range analysis occurs, and whether the compiler could also warn the programmer about the elided redundant test, the way it flags potential programming errors in if (a = b) ... or a = b << 1 + 2;
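
For comparison, a minimal sketch of what GCC already diagnoses with -Wall (via -Wparentheses); the function and variable names are made up:

void f(int a, int b)
{
    if (a = b)          // warning: suggest parentheses around assignment used as truth value
        a = b << 1 + 2; // warning: suggest parentheses around '+' inside '<<'
}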


What makes this *optimisation* perverse, IMHO, is the lack of a warning for a non-obvious transformation.

Unlike compilers, programmers are human and make mistakes. Even 10x programmers sometimes make stupid mistakes, and compilers should not assume that programmers are always right, unless they explicitly boast about it as in if ((a = b)) ...

The incorrect order of the tests g_ptrArray[idx] != nullptr and idx < ARRAY_LENGTH should be obvious in a code review. Conversely, the fact that idx < ARRAY_LENGTH is redundant if g_ptrArray[idx] != nullptr is assumed to be legitimate is non-obvious to human readers of the code. The compiler assumes that the programmer knows that g_ptrArray[idx] != nullptr can be performed, hence assumes that idx is in the proper range and deduces that the second test is redundant. This is perverse: if the programmer was shrewd enough to correctly assume idx is always in the proper range, they surely would not have written the redundant test. Conversely, if they made a mistake and wrote the tests in the wrong order, flagging the redundant code as such would help fix a blatant bug.

When compilers become smart enough to detect such redundancy, this level of analysis should profit the programmer, and help detect programming errors, not make debugging harder than it already is.

Di answered 28/10, 2023 at 8:4 Comment(11)
Hopefully tools like gcc -fsanitize=undefined (or perhaps Valgrind?) would be able to catch the out-of-bounds read in g_ptrArray[idx] != nullptr. But only if you had a test-case where all the pointers were non-null, so you'd have to already be looking for possible errors in loop termination to catch it. Good point about the lack of warning. To me, this optimization does seem obvious, but you're right, I wouldn't have written it that way in the first place (at least not on purpose) because I know you need to check things before you use them at all.Hangbird
@PeterCordes: Such tools can only help if they are enabled when a program receives inputs that are contrived to yield the relevant corner cases. If application requirements would be satisfied, even for malicious inputs, by a function whose behavior is "if x is less than 5, yield a[x], and otherwise yield an arbitrary value without side effects", then a function which simply returns a[x] may be transformed by one version of a compiler into machine code that satisfies application requirements 100% of the time, but by a later compiler may require that the code be written less efficiently...Qktp
...in order to satisfy application requirements.Qktp
@supercat: in the general case array overread isn't safe: it could page-fault on a system with memory protection (depending on how far, and how close the end of the array was to an unmapped page). Or on a system without memory protection, it could generate the address of an MMIO register, and thus have a side-effect. But sure, if like in this case your x has some known constraints like being at most 1 past the end, then that amount of over-read wouldn't be a problem in asm as long as there's something after it in the .data section. (Or you could pad the array with a null pointer.)Hangbird
@PeterCordes: A programmer will often know more than the compiler about which memory overreads would be safe or unsafe in the target environment. If the most efficient means of accomplishing a task would sometimes perform memory overreads which the target environment would be specified to process without side effects, a compiler that offers no way to specify a sequence of operations that may result in such overreads will be incapable of generating optimal machine code for that environment. If a compiler's generated code will never be used in any environment where an attempt to write...Qktp
...an object with the value it already holds would be distinguishable from a no-op, then a transform like replacing if (x) *p += x; with *p += x; would be valid, but should be recognized as limiting the range of execution environments for which the compiler would be suitable.Qktp
I don't see any attempt at answering the question.Render
@CarstenS: The first two paragraphs are a concise answer. If the answer stopped there, it would have value for its brevity. But the rant about C is misplaced and actively makes this much worse.Whensoever
@MSalters: I actually liked the second part of this answer; prior to reading that, I'd had a hard time seeing the point-of-view of people who didn't want their compilers to optimize their buggy code. Like, of course compilers are going to optimize if you write it this way, that's what compilers are for. But the point about a lack of warning, and that nobody would have written it that way (if they knew what they were doing) are well-taken; some other language potholes do have warnings for things that would be dangerous, like always-true loop conditions like unsigned >= 0Hangbird
@MSalters: (But counterpoint, real code might not look like this. It could be logically equivalent after some inlining and simplification, in cases where we do want a bounds-check to optimize away because we know this array is null-terminated, and are expressing that to the compiler by unconditionally dereferencing before running some iterator-increment function. That's probably the kind of case compiler devs were thinking of when implementing optimizations like this. e.g. related to optimizing "safe" containers in C++ that do bounds-checks on every access.)Hangbird
This isn't the place to discuss this, but here we go... I have a toy example here: godbolt.org/z/x3TEe84Gf I also would like a diagnostic in foobar, which is obviously nonsense. However, the possibly dangerous part is baz, because the programmer may have expected the check in foo to apply. On the other hand, removing the check in bar is imo what one would want. So what is the compiler supposed to do? These things aren't easy.Render

I want to emphasize, since other answers haven't, that the order of execution of the array lookup and the bounds test doesn't matter. If you had written

    bool in_bounds = idx < ARRAY_LENGTH;
    if ( g_ptrArray[ idx ] != nullptr && in_bounds ) { ... }

then the bounds test could still be dropped, even though it's sequenced before the array lookup. Even if you made in_bounds a volatile bool, GCC could still drop the test and write a constant true to it.

What matters is whether the code g_ptrArray[ idx ] executes. If it always executes, as above, then idx can be assumed to be in bounds even in earlier uses (if there's no intervening assignment to it, of course). The bounds check isn't dropped in ( idx < ARRAY_LENGTH && g_ptrArray[ idx ] != nullptr ) because &&'s short-circuiting behavior means g_ptrArray[ idx ] doesn't always run.
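
By contrast, here is a minimal sketch (same declarations as in the question) where the lookup is guarded, so the test has to survive:

    bool in_bounds = idx < ARRAY_LENGTH;
    if ( in_bounds && g_ptrArray[ idx ] != nullptr )
    {
        // g_ptrArray[ idx ] only executes when in_bounds is true, so the
        // compiler learns nothing about idx and must keep the test
        ...
    }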

Embayment answered 29/10, 2023 at 6:9 Comment(0)

When the C Standard was written, C implementations had at least three different ways they would process something like:

extern int a[5];
int x = a[i];

in cases where i was outside the range 0..4:

  1. Some implementations used an abstraction model that would add i to the address of a, completely agnostic to whether the resulting address fell within a, and would perform a load from that address with whatever consequences would occur (Footnote 1).

  2. Some implementations would attempt to trap on accesses outside the range 0..4.

  3. Some implementations would generally behave as in #1, except that if the access to a[i] was preceded and followed, within the same function, by accesses to b[0], and there was no evidence to suggest that anything between those accesses might touch that storage, a compiler might consolidate the two accesses to b[0].

Each of these approaches would have advantages for some tasks and disadvantages for others. Rather than trying to imply that all compilers should use the same approach, the Standard opted to allow implementations to select among them, or any other approaches that might be useful (including approaches not yet invented), however they saw fit, on the presumption that implementations would select whatever approach was most useful for the programmers targeting them. The Standard did this by categorizing as Undefined Behavior the situations where #1 and #3 would be observably different (Footnote 2).
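
A hedged sketch of how approaches #1 and #3 could observably differ (the adjacent-layout premise comes from Footnote 1; the function name and the call scenario are made up):

extern int a[5];
extern int b[5];   // assume the environment places b directly after a

int demo(int i)    // imagine a call with i == 5
{
    b[0] = 1;
    a[i] = 2;      // approach #1 stores through the computed address,
                   // which under the layout above is exactly b[0]
    return b[0];   // approach #1 reloads and returns 2; approach #3 may
                   // consolidate this read with the store of 1 and return 1
}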

Compilers like gcc and clang are designed for tasks that are best served by identifying and eliminating code that would only be relevant in situations where the Standard imposes no requirements, and that would not benefit from other treatments, such as specifying that such an array read may, at the compiler's leisure, either be processed by reading the storage (with whatever consequences result) or yield an arbitrary value in side-effect-free fashion. That elimination is what you are seeing in this example.

Footnote 1: While C may not provide any means of forcing any particular object to be assigned storage immediately following a, other languages do offer control over layout, and if a and some other object extern int b[5]; were forced to be stored consecutively, an access to a[5] would be an access to b[0].

Footnote 2: Some approaches such as #3 would be incompatible with classifying the behavior as "Implementation Defined". Because the "as-if" rule requires that no observable aspects of program behavior be affected by optimizations except in scenarios categorized as Undefined Behavior, and because the behavior of out-of-bounds accesses on implementations using approach #3 would be affected by optimization, it was necessary for the Standard to categorize such accesses as Undefined Behavior.

This was in no way intended to imply that implementations which allowed objects' addresses to be assigned in ways that made approach #1 useful shouldn't continue to support that approach, nor that use of array-access syntax in such fashion wasn't a good way for implementations that allowed precise control over memory layout to allow programmers to exploit such control.

Although the Standard explicitly specifies, in its definition of strictly conforming C programs, that such programs must not perform any actions characterized as invoking Undefined Behavior, its definition of "conforming C programs" is devoid of such a requirement. It also expressly states that the phraseology "X shall be Y" means nothing more nor less than that executing a construct when X isn't Y invokes Undefined Behavior [implying that such execution would be forbidden in strictly conforming C programs, but not in conforming C programs]. Nonetheless, some people treat the Standard as not recognizing any category of conformance to which many constraints would not apply.

Qktp answered 29/10, 2023 at 17:16 Comment(5)
Footnote 1 is misleading. The real-world worst-case scenario at the time was memory-mapped hardware. An out of bounds read could do literally anything. Returning a value from actual memory was the most benign of outcomes.Whensoever
Footnote 2 is also misleading. The as-if rule does not require that. The as-if rule only requires that an optimizing compiler compiles the source code according to the abstract rules of the Standard and the Implementation-Defined Behavior. There is no rule whatsoever that the optimizing compiler has to somehow match a non-optimizing compiler. In particular, Unspecified Behavior can and will vary heavily depending on optimization (an out of bounds read is not Unspecified Behavior - see previous comment)Whensoever
None of this rant actually answers the question asked here.Dirac
@MSalters: My point with footnote 1 was that there would be situations where a programmer would know that the computed address was meaningful, even if a compiler couldn't. The main-paragraph text "whatever consequences would occur" accommodated the possibility of memory mapped I/O.Qktp
@user3840170: The compiler is using the Standard's laxity surrounding the evaluation of g_ptrArray[ idx ] to generate code that skips the if in cases where idx would be ARRAY_LENGTH or greater. The Standard says that implementations may opt to impose a constraint on idx, and some choose to do so, but such practice is hardly universal; a choice to e.g. use K&R behavior (#1 above) is not a defect. N1570 section 4 "Conformance" paragraph 2 indicates that the Standard uses the phrase "X shall be Y" as shorthand for "The Standard only exercises jurisdiction when X is Y".Qktp
