Can branches with undefined behavior be assumed unreachable and optimized as dead code?

Consider the following statement:

*((char*)NULL) = 0; //undefined behavior

It clearly invokes undefined behavior. Does the existence of such a statement in a given program mean that the whole program is undefined or that behavior only becomes undefined once control flow hits this statement?

Would the following program be well-defined in case the user never enters the number 3?

while (true) {
 int num = ReadNumberFromConsole();
 if (num == 3)
  *((char*)NULL) = 0; //undefined behavior
}

Or is it entirely undefined behavior no matter what the user enters?

Also, can the compiler assume that undefined behavior will never be executed at runtime? That would allow for reasoning backwards in time:

int num = ReadNumberFromConsole();

if (num == 3) {
 PrintToConsole(num);
 *((char*)NULL) = 0; //undefined behavior
}

Here, the compiler could reason that in case num == 3 we will always invoke undefined behavior. Therefore, this case must be impossible and the number does not need to be printed. The entire if statement could be optimized out. Is this kind of backwards reasoning allowed according to the standard?

Bilbao answered 18/4, 2014 at 11:44 Comment(11)
sometimes I wonder if users with lots of rep get more upvotes on questions because "oh they have a lot of rep, this must be a good question"... but in this case I read the question and thought "wow, this is great" before I even looked at the asker.Unhallow
I think that the time when the undefined behaviour emerges is undefined.Perchance
If unreachable code produces undefined behaviour, then that is effectively the same as not having that code in the first place. It would be like having some matches around with no way to ignite themBrenna
The C++ standard explicitly says that an execution path with undefined behavior at any point is completely undefined. I would even interpret it as saying that any program with undefined behavior on any path is completely undefined (it may still produce reasonable results on other paths, but that is not guaranteed). Compilers are free to use the undefined behavior to modify your program. blog.llvm.org/2011/05/what-every-c-programmer-should-know.html contains some nice examples.Noranorah
@Jens: It really means just the executing path. Else you get into troubles over const int i = 0; if (i) 5/i;.Aftershock
The compiler in general cannot prove that PrintToConsole doesn't call std::exit so it has to make the call.Aftershock
@Aftershock good point. PrintToConsole could be replaced with a side-effecting write to a global. I'm interested in all reasonable variations of this problem. Don't restrict yourself to this example if I have chosen it badly.Bilbao
Can you please explain why that statement invokes undefined behavior?Dumm
@MatteoItalia the question you suggested is a subset of this one. It does not cover the backwards reasoning aspect.Bilbao
Raymond Chen on the subject: Undefined behavior can result in time travel (among other things, but time travel is the funkiest)Doloresdolorimetry
Another question where "time travel" code is cited: https://mcmap.net/q/23939/-is-there-any-guarantee-about-whether-code-with-ub-should-be-reachable/57428Memnon

Does the existence of such a statement in a given program mean that the whole program is undefined or that behavior only becomes undefined once control flow hits this statement?

Neither. The first condition is too strong and the second is too weak.

Object accesses are sometimes sequenced, but the standard describes the behavior of the program outside of time. Danvil already quoted:

if any such execution contains an undefined operation, this International Standard places no requirement on the implementation executing that program with that input (not even with regard to operations preceding the first undefined operation)

This can be interpreted as:

If the execution of the program yields undefined behavior, then the whole program has undefined behavior.

So, an unreachable statement with UB doesn't give the program UB. Neither does a statement that could be reached but (because of the values of the inputs) never actually is. That's why your first condition is too strong.

Now, the compiler cannot in general tell what has UB. So to allow the optimizer to re-order statements with potential UB that would be re-orderable should their behavior be defined, it's necessary to permit UB to "reach back in time" and go wrong prior to the preceding sequence point (or in C++11 terminology, for the UB to affect things that are sequenced before the UB thing). Therefore your second condition is too weak.

A major example of this is when the optimizer relies on strict aliasing. The whole point of the strict aliasing rules is to allow the compiler to re-order operations that could not validly be re-ordered if it were possible that the pointers in question alias the same memory. So if you use illegally aliasing pointers, and UB does occur, then it can easily affect a statement "before" the UB statement. As far as the abstract machine is concerned the UB statement has not been executed yet. As far as the actual object code is concerned, it has been partly or fully executed. But the standard doesn't try to get into detail about what it means for the optimizer to re-order statements, or what the implications of that are for UB. It just gives the implementation license to go wrong as soon as it pleases.
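
A rough sketch of the kind of re-ordering this enables (the function and the call below are invented for illustration, not taken from any particular compiler's output):

int reorder_example(int *i, float *f) {
    *i = 1;
    *f = 2.0f;   // under strict aliasing, assumed not to modify *i
    return *i;   // so the compiler may fold this to the constant 1
}

If a caller breaks the aliasing rules, e.g. reorder_example(&x, reinterpret_cast<float*>(&x)) for an int x, the UB formally occurs at the *f write (an int object accessed through a float lvalue), but its effects can show up in statements that appear to come before or after it.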

You can think of this as, "UB has a time machine".

Specifically to answer your examples:

  • Behavior is only undefined if 3 is read.
  • Compilers can and do eliminate code as dead if a basic block contains an operation certain to be undefined. They're permitted to (and I'm guessing some do) in cases which aren't a basic block but where all branches lead to UB. This example isn't a candidate unless PrintToConsole(3) is somehow known to be sure to return. It could throw an exception or whatever.

A similar example to your second is the gcc option -fdelete-null-pointer-checks, which can take code like this (I haven't checked this specific example, consider it illustrative of the general idea):

void foo(int *p) {
    if (p) *p = 3;
    std::cout << *p << '\n';
}

and change it to:

*p = 3;
std::cout << "3\n";

Why? Because if p is null then the code has UB anyway, so the compiler may assume it is not null and optimize accordingly. The Linux kernel tripped over this (https://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2009-1897) essentially because it operates in a mode where dereferencing a null pointer isn't supposed to be UB; it's expected to result in a defined hardware exception that the kernel can handle. When optimization is enabled, gcc requires the use of -fno-delete-null-pointer-checks in order to provide that beyond-standard guarantee.
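
A simplified sketch of the pattern behind that kernel bug (illustrative only; this is not the actual kernel source, and the type and field names are stand-ins):

struct device { int *state; };

int read_state(struct device *dev) {
    int *state = dev->state;   // dereference first: UB if dev is null
    if (!dev)                  // so the compiler may conclude dev != NULL
        return -1;             //   and delete this check as dead code
    return *state;
}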

P.S. The practical answer to the question "when does undefined behavior strike?" is "10 minutes before you were planning to leave for the day".

Pean answered 18/4, 2014 at 12:48 Comment(5)
Actually, there were quite a few security issues due to this in the past. In particular, any after-the-fact overflow check is in danger of being optimized away due to this. For example void can_add(int x) { if (x + 100 < x) complain(); } can be optimized away entirely, because if x+100 doesn't overflow nothing happens, and if x+100 does overflow, that's UB according to the standard, so nothing might happen. (A way to write this check without relying on overflow is sketched after these comments.)Lemmueu
@fgp: right, that's an optimization that people complain about bitterly if they trip over it, because it starts to feel like the compiler is deliberately breaking your code to punish you. "Why would I have written it that way if I wanted you to remove it!" ;-) But I think sometimes it's useful to the optimizer when manipulating larger arithmetic expressions, to assume there's no overflow and avoid anything expensive that would only be needed in those cases.Pean
Would it be correct to say that the program is not undefined if the user never enters 3, but if he enters 3 during an execution the whole execution becomes undefined? As soon as it is 100% certain that the program will invoke undefined behavior (and no sooner than that) behavior becomes allowed to be anything. Are these statements of mine 100% correct?Bilbao
@usr: I believe that's correct, yes. With your particular example (and making some assumptions about the inevitability of the data being processed) I think an implementation could in principle look ahead in buffered STDIN for a 3 if it wanted to, and pack off home for the day as soon as it saw one incoming.Pean
An extra +1 (if I could) for your P.S.Woodall
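
Regarding the overflow check in the comments above, here is a sketch of a version that decides the same thing without ever evaluating a signed overflow (complain() is assumed to be declared elsewhere, as in the comment):

#include <climits>

void complain();

void can_add(int x) {
    if (x > INT_MAX - 100)   // true exactly when x + 100 would overflow
        complain();
}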

The standard states at 1.9/4

[ Note: This International Standard imposes no requirements on the behavior of programs that contain undefined behavior. — end note ]

The interesting point is probably what "contain" means. A little later at 1.9/5 it states:

However, if any such execution contains an undefined operation, this International Standard places no requirement on the implementation executing that program with that input (not even with regard to operations preceding the first undefined operation)

Here it specifically mentions "execution ... with that input". I would interpret that as: undefined behaviour in one possible branch which is not executed for the current input does not influence the current branch of execution.

A different issue, however, is the assumptions based on undefined behaviour that are made during code generation. See the answer of Steve Jessop for more details about that.

Occurrence answered 18/4, 2014 at 11:57 Comment(4)
If taken literally, that is the death sentence for all real programs in existence.Bilbao
I don't think the question was whether UB may appear before the code is actually reached. The question, as I understood it, was whether UB may appear if the code is never reached at all. And of course the answer to that is "no".Deist
Well the standard is not so clear about this in 1.9/4, but 1.9/5 can possibly be interpreted as what you said.Occurrence
Notes are non-normative. 1.9/5 trumps the note in 1.9/4Aftershock

An instructive example is

int foo(int x)
{
    int a;
    if (x)
        return a;
    return 0;
}

Both current GCC and current Clang will optimize this (on x86) to

xorl %eax,%eax
ret

because they deduce that x is always zero from the UB in the if (x) control path. GCC won't even give you a use-of-uninitialized-value warning! (because the pass that applies the above logic runs before the pass that generates uninitialized-value warnings)

Doublure answered 19/4, 2014 at 2:46 Comment(13)
Interesting example. It's rather nasty that enabling optimization hides the warning. This isn't even documented - the GCC docs only say that enabling optimization produces more warnings.Servant
@Servant It is nasty, I agree, but uninitialized value warnings are notoriously difficult to "get right" -- doing them perfectly is equivalent to the Halting Problem, and programmers get weirdly irrational about adding "unnecessary" variable initializations to squelch false positives, so compiler authors wind up being over a barrel. I used to hack on GCC and I recall that everyone was scared to mess with the uninitialized-value-warning pass.Doublure
@zwol: I wonder how much of the "optimization" resulting from such dead-code elimination actually makes useful code smaller, and how much ends up causing programmers to make code bigger (e.g. by adding code to initialize a even when, in every circumstance where an uninitialized a would get passed to the function, that function would never do anything with it)?Cineaste
@Cineaste I haven't been deeply involved in compiler work in ~10 years, and it's almost impossible to reason about optimizations from toy examples. This type of optimization tends to be associated with 2-5% overall code-size reductions on real applications, if I remember correctly.Doublure
@zwol: That doesn't seem like a whole lot, given that aggressive forms will break code which used to work 100% reliably on anything even resembling the systems for which it was designed (e.g. two's-complement math). I find it curious that compiler writers would favor that sort of optimization over giving programmers more ways of specifying what they do or don't care about (e.g. compute the sum of two integers in cases where it doesn't overflow, or yield an arbitrary value if it does overflow, but don't negate the laws of time and causality in either case).Cineaste
@Cineaste 2-5% is huge as these things go. I've seen people sweat for 0.1%.Doublure
@zwol: I've sweated to pick up a few bytes on occasion, but would be rather annoyed in such cases if I'd had to waste code to prevent UB in situations where the natural platform behavior would have been benign. I would think that the compilation process could be more efficient if much of the analysis were only done in a step which identified places where assumptions would be helpful and invited the programmer to add __ASSUME* or __REJECT_ASSUMPTION directives [the semantics of the latter being "don't bother me again"], and if such directives were used for optimization.Cineaste
@Cineaste Compiler authors are caught between people like you, and people who say "I have a gigantic white-elephant C++ codebase. I won't be making any changes to it, but I will give you $10,000,000 to make it go fast anyway." No foolin'.Doublure
@zwol: I would think the directives-based approach would be easier and work better in that situation than blindly assuming a program never engages in UB, especially in scenarios where dealing with the natural platform consequences of UB would be much cheaper than ensuring it doesn't occur. Further, many of C's rules are really horribly broken for high-performance computing, like the one that says that char* can alias anything but nothing else can, or the ones that say that signed types can invoke UB on overflow, but unsigned types can't, unless they get promoted to signed types, in which case all bets are off.Cineaste
@zwol: I would think that letting programmers specify more about what things will or won't alias, or when deterministic behavior is needed with unsigned types that wrap, could pick up 2-5% with a lot less pain than having formerly-defined behaviors break the laws of time and causality. Further, an ability to say "if any overflow occurs in this stretch of code, I need it to trap in a deterministic fashion, but don't care how much of the code runs first" could generate much faster code than code which needs to examine all arithmetic operands.Cineaste
PS--Returning to your example above, the omission of the if wouldn't observably affect program behavior unless an implementation specified that uninitialized variables would receive a particular non-zero value. Variable a could legitimately hold zero, in which case the code would return zero regardless of the value of x even without the compiler having to use Undefined Behavior as a justification.Cineaste
@Cineaste I'm not exactly disagreeing with you, I'm just saying that there's this other, deep-pocketed constituency whose priorities are ~ diametrically opposed.Doublure
@zwol: I'm trying to understand what's going on; is the other constituency more interested in having an excuse for why things don't work than in actually making them work? Such attitudes aren't terribly uncommon, and would explain a lot. As for your example, am I correctly figuring that the behavior is as though a were implicitly initialized to zero, and thus UB is irrelevant as a justification for what the compiler is doing?Cineaste

The current C++ working draft says in 1.9.4 that

This International Standard imposes no requirements on the behavior of programs that contain undefined behavior.

Based on this, I would say that a program containing undefined behavior on any execution path can do anything at every time of its execution.

There are two really good articles on undefined behavior and what compilers usually do:

Noranorah answered 18/4, 2014 at 12:2 Comment(12)
That makes no sense. The function int f(int x) { if (x > 0) return 100/x; else return 100; } certainly never invokes undefined behaviour, even though 100/0 is of course undefined.Lemmueu
@Lemmueu What the standard (especially 1.9/5) says, though, is that if undefined behaviour can be reached, it doesn't matter when it is reached. For example, printf("Hello, World"); *((char*)NULL) = 0 isn't guaranteed to print anything. This aids optimization, because the compiler may freely re-order operations (subject to dependency constraints, of course) that it knows will occur eventually, without having to take undefined behaviour into account.Lemmueu
I would say that a program with your function does not contain undefined behavior, because there is no input where 100/0 will be evaluated.Noranorah
Exactly - so what matters is whether the UB can actually be triggered or not, not whether it can theoretically be triggered. Or are you prepared to argue that int x,y; std::cin >> x >> y; std::cout << (x+y); is allowed to say that "1+1 = 17", just because there are some inputs where x+y overflows (which is UB since int is a signed type).Lemmueu
Formally, I would say that the program has undefined behavior because there exist inputs that trigger it. But you are right that this does not make sense in the context of C++, because it would be impossible to write a program without undefined behavior. I would like it if there were less undefined behavior in C++, but that is not how the language works (and there are some good reasons for this, but they don't concern my daily usage...).Noranorah
Doesn't this depend on the definition of "program" and "contains"? (Which I assume are actually defined in the standard). Because a program execution may not contain UB, the way I would think of "contain", even though a program's code does.Kidder
wait... n.m.'s point is important. If my program were int array[10] and asked a user for an array index, then I accessed it without checking, would it "contain" UB? What if I write a "fast" library without bounds checking and leave that up to the client library? Whether a program contains UB depends on what happens in execution... or else nearly every C program ever written is undefined.Kidder
@djechlin. Yes, that is exactly what I wanted to say. Whether a program contains undefined behavior depends on the execution with the given inputs, partly because the compiler cannot find out if a statement will be executed or not. The C/C++ language allows undefined behavior to accommodate hardware variability and favor fast execution, as you say in your example of fast library code. However, I would prefer to have less undefined behavior, because hardware is more standardized today and the language could rely more on this standard behavior, e.g. integers represented as two's complement.Noranorah
@fgp: In what cases would a compiler be entitled to assume that a printf will always complete? If code does void foo() { int i = printf("hey")/0; } and stdout is sent to a terminal whose buffer is one character from full, and which is indefinitely blocked, would the division operation not be required to wait until the text being output was successfully buffered?Cineaste
@Cineaste Well, if printf never returns, what the code that follows it does isn't relevant. So the real question is not whether the compiler may or may not assume that printf returns, but whether it may move stuff from after the printf to before it. It may do that if it can prove that moving things doesn't change the result of the program. But triggering undefined behaviour doesn't count as a result! In other words, the compiler simply assumes that UB is never encountered, and may hence move the division to before the printf.Lemmueu
@Cineaste Except that the result of the printf is in your case one of the operands of the division. Meaning the compiler cannot actually move it, because there's a data dependency between the division operation and the printf call. There's still more though - your second operand is a constant, so the compiler may, in fact, infer that the division will always produce UB. And may, in theory, therefore compile this function to whatever it wants, including system("rm -r/")... In practice, it will probably just emit a warning, though ;-).Lemmueu
@fgp: If printf is allowed to terminate program execution without returning, I would think that the behavior of the program would be well-defined in the event that it did so. While there may be some platforms in which printf is guaranteed to return immediately (on some embedded platforms, I've defined stdout to discard data when the output buffer is full to avoid delaying time-critical main-line code), there are certainly many platforms where the opposite is true. Does the standard say anything in that regard?Cineaste

The word "behavior" means something is being done. A statemenr that is never executed is not "behavior".

An illustration:

*ptr = 0;

Is that undefined behavior? Suppose we are 100% certain ptr == nullptr at least once during program execution. The answer should be yes.

What about this?

 if (ptr) *ptr = 0;

Is that undefined? (Remember ptr == nullptr at least once?) I sure hope not, otherwise you won't be able to write any useful program at all.

No standardese was harmed in the making of this answer.

Fordone answered 18/4, 2014 at 13:24 Comment(0)

The undefined behavior strikes when the program will cause undefined behavior no matter what happens next. However, you gave the following example.

int num = ReadNumberFromConsole();

if (num == 3) {
 PrintToConsole(num);
 *((char*)NULL) = 0; //undefined behavior
}

Unless the compiler knows the definition of PrintToConsole, it cannot remove the if (num == 3) conditional. Let's assume that you have a LongAndCamelCaseStdio.h system header with the following declaration of PrintToConsole.

void PrintToConsole(int);

Nothing too helpful, all right. Now, let's see how evil (or perhaps not so evil, undefined behavior could have been worse) the vendor is, by checking the actual definition of this function.

int printf(const char *, ...);
void exit(int);

void PrintToConsole(int num) {
    printf("%d\n", num);
    exit(0);
}

The compiler has to assume that any function whose definition it cannot see may exit the program or throw an exception (in the case of C++). You can notice that *((char*)NULL) = 0; won't be executed, as execution won't continue after the PrintToConsole call.

The undefined behavior strikes when PrintToConsole actually returns. The compiler expects this not to happen (as this would cause the program to execute undefined behavior no matter what), therefore anything can happen.

However, let's consider something else. Let's say we do a null check, but then use the pointer after the null check.

int putchar(int);

const char *warning;

void lol_null_check(const char *pointer) {
    if (!pointer) {
        warning = "pointer is null";
    }
    putchar(*pointer);
}

In this case, it's easy to notice that lol_null_check requires a non-NULL pointer. Assigning to the global non-volatile warning variable is not something that could exit the program or throw any exception. The pointer is also non-volatile, so it cannot magically change its value in the middle of the function (if it does, that's undefined behavior). Calling lol_null_check(NULL) will cause undefined behavior, which may cause the variable to not be assigned (because at this point, the fact that the program executes undefined behavior is known).

However, the undefined behavior means the program can do anything. Therefore, nothing stops the undefined behavior from going back in time, and crashing your program before the first line of int main() executes. It's undefined behavior; it doesn't have to make sense. It may as well crash after you type 3, but the undefined behavior will go back in time and crash before you even type 3. And who knows, perhaps undefined behavior will overwrite your system RAM, and cause your system to crash two weeks later, while your undefined program is not even running.

Declared answered 18/5, 2014 at 11:47 Comment(3)
All valid points. PrintToConsole is my attempt at inserting a program-external side-effect that is visible even after crashes and is strongly sequenced. I wanted to create a situation where we can tell for sure whether this statement was optimized out. But you are right in that it might never return. Your example of writing to a global might be subject to other optimizations that are unrelated to UB. For example an unused global can be deleted. Do you have an idea for creating an external side-effect in a way that is guaranteed to return control?Bilbao
Can any outside-world-observable side-effects be produced by code which a compiler would be free to assume returns? By my understanding, even a method which simply reads a volatile variable could legitimately trigger an I/O operation which could in turn immediately interrupt the current thread; the interrupt handler could then kill the thread before it has a chance to perform anything else. I see no justification by which the compiler could push undefined behavior prior to that point.Cineaste
From the standpoint of the C standard, there would be nothing illegal about having Undefined Behavior cause the computer send a message to some people who would track down and destroy all evidence of the program's previous actions, but if an action could terminate a thread, then everything that is sequenced before that action would have to happen before any Undefined Behavior which occurred after it.Cineaste

If the program reaches a statement that invokes undefined behavior, no requirements are placed on any of the program's output/behavior whatsoever; it doesn't matter whether they would take place "before" or "after" undefined behavior is invoked.

Your reasoning about all three code snippets is correct. In particular, a compiler may treat any statement which unconditionally invokes undefined behavior the way GCC treats __builtin_unreachable(): as an optimization hint that the statement is unreachable (and thereby, that all code paths leading unconditionally to it are also unreachable). Other similar optimizations are of course possible.
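
A minimal sketch of that hint in use (the builtin is GCC/Clang-specific; the function and the value tested are made up for illustration):

int f(int num) {
    if (num == 3)
        __builtin_unreachable();   // or any statement with unconditional UB
    // The optimizer may now assume num != 3 and drop the branch entirely.
    return num;
}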

Mollescent answered 18/4, 2014 at 23:15 Comment(8)
Out of curiosity, when did __builtin_unreachable() start having effects that proceeded both backward and forward in time? Given something like extern volatile uint32_t RESET_TRIGGER; void RESET(void) { RESET_TRIGGER = 0xAA55; __memorybarrier(); __builtin_unreachable(); } I could see the builtin_unreachable() as being good to let the compiler know it can omit the return instruction, but that would be rather different from saying that preceding code could be omitted.Cineaste
@Cineaste since RESET_TRIGGER is volatile the write to that location can have arbitrary side effects. To the compiler it's like an opaque method call. Therefore, it cannot be proven (and is not the case) that __builtin_unreachable is reached. This program is defined.Bilbao
@usr: I would think that low-level compilers should treat volatile accesses as opaque method calls, but neither clang nor gcc does so. Among other things, an opaque method call could cause all the bytes of any object whose address has been exposed to the outside world, and which neither has been nor will be accessed by a live restrict pointer, to be written using an unsigned char*.Cineaste
@usr: If a compiler doesn't treat a volatile access as an opaque method call with regard to accesses to exposed objects, I see no particular reason to expect that it would do so for other purposes. The Standard doesn't require that implementations do so, because there are some hardware platforms where a compiler might be able to know all possible effects from a volatile access. A compiler suitable for embedded use, however, should recognize that volatile accesses might trigger hardware that hadn't been invented when the compiler was written.Cineaste
@Cineaste I think you're right. It seems volatile operations have "no effect on the abstract machine" and can therefore not terminate the program or cause side effects.Bilbao
@usr: The Standard should include a big notice in bold print that it defines what is necessary for an implementation to be conforming, and makes no effort to define all the features and guarantees that may be necessary to make an implementation suitable for any particular purpose. It has for far too long been horribly misconstrued in ways which would have caused it to be soundly rejected if people had any idea how it would be interpreted.Cineaste
@usr: If a compiler is designed to generate code for an unmodified hardware platform where swapping volatile stores with other stores would not be observable, it should not be expected to accommodate the possibility of someone installing bus-monitor hardware to observe such things. If however, a hardware platform supports features like DMA which may access ordinary storage in response to certain trigger actions, an implementation suitable for low-level programming on such hardware must provide some form of sequencing barrier. If implementations have an option to treat volatile accesses...Cineaste
...as such a barrier, low-level code for one such implementation will be usable on all. Having an implementation define other directives and include an option to say "All places where barriers are needed have been marked, so a compiler need not generate any marks elsewhere" may allow code targeting a particular compiler to be optimized more effectively, but if the programmer would be satisfied with the level of optimization that would result from treating all volatile accesses as barriers, why require a programmer to do more work than that?Cineaste

Many standards for many kinds of things expend a lot of effort on describing things which implementations SHOULD or SHOULD NOT do, using nomenclature similar to that defined in IETF RFC 2119 (though not necessarily citing the definitions in that document). In many cases, descriptions of things that implementations should do except in cases where they would be useless or impractical are more important than the requirements to which all conforming implementations must conform.

Unfortunately, C and C++ Standards tend to eschew descriptions of things which, while not 100% required, should nonetheless be expected of quality implementations which don't document contrary behavior. A suggestion that implementations should do something might be seen as implying that those which don't are inferior, and in cases where it would generally be obvious which behaviors would be useful or practical, versus impractical and useless, on a given implementation, there was little perceived need for the Standard to interfere with such judgments.

A clever compiler could conform to the Standard while eliminating any code that would have no effect except when code receives inputs that would inevitably cause Undefined Behavior, but "clever" and "dumb" are not antonyms. The fact that the authors of the Standard decided that there might be some kinds of implementations where behaving usefully in a given situation would be useless and impractical does not imply any judgment as to whether such behaviors should be considered practical and useful on others. If an implementation could uphold a behavioral guarantee for no cost beyond the loss of a "dead-branch" pruning opportunity, almost any value user code could receive from that guarantee would exceed the cost of providing it. Dead-branch elimination may be fine in cases where it wouldn't require giving up anything, but if in a given situation user code could have handled almost any possible behavior other than dead-branch elimination, any effort user code would have to expend to avoid UB would likely exceed the value achieved from DBE.

Cineaste answered 12/5, 2017 at 21:54 Comment(5)
It is a good point that avoiding UB can impose a cost on user code.Bilbao
@usr: It's a point that modernists completely miss. Should I add an example? E.g. if code needs to evaluate x*y < z when x*y doesn't overflow, and in case of overflow yield either 0 or 1 in arbitrary fashion but without side-effects, there is no reason on most platforms why meeting the second and third requirements should be more expensive than meeting the first, but any way of writing the expression to guarantee Standard-defined behavior in all cases would in some cases add significant cost. Writing the expression as (int64_t)x*y < z could more than quadruple the computation cost...Cineaste
...on some platforms, and writing it as (int)((unsigned)x*y) < z would prevent a compiler from employing what might otherwise have been useful algebraic substitutions (e.g. if it knows that x and z are equal and positive, it could simplify the original expression to y<0, but the version using unsigned would force the compiler to perform the multiply). If the compiler can guarantee, even though the Standard doesn't mandate it, that it will uphold the "yield 0 or 1 with no side-effects" requirement, user code could give the compiler optimization opportunities it could not otherwise get. (The variants are sketched in code after this comment thread.)Cineaste
Yeah, it seems some milder form of undefined behavior would be helpful here. The programmer could turn on a mode that causes x*y to yield some normal value in case of overflow - but an arbitrary one. Configurable UB in C/C++ seems important to me.Bilbao
@usr: If the authors of the C89 Standard were being truthful in saying that the promotion of short unsigned values to signed int was the most serious breaking change, and were not ignorant fools, that would imply that they expected that in cases where platforms had been defining useful behavioral guarantees, implementations for such platforms had been making those guarantees available to programmers, and programmers had been exploiting them, compilers for such platforms would continue to offer such behavioral guarantees whether the Standard ordered them to or not.Cineaste
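
For concreteness, a sketch of the comparison variants discussed in the comments above (the function names are invented; the trade-offs are as described by the commenter, not guarantees taken from the Standard):

#include <cstdint>

bool cmp_naive(int x, int y, int z) { return x * y < z; }                  // UB if x*y overflows
bool cmp_wide(int x, int y, int z)  { return (std::int64_t)x * y < z; }    // always defined; may cost more
bool cmp_wrap(int x, int y, int z)  { return (int)((unsigned)x * y) < z; } // wraps; the conversion back to int
                                                                           // is implementation-defined before C++20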
