"Observable behaviour" and compiler freedom to eliminate/transform pieces c++ code

After reading this discussion I realized that I almost totally misunderstand the matter :)

Since the description of the C++ abstract machine is not rigorous enough (compared, for instance, with the JVM specification), and since a precise answer may not be possible, I would like to get informal clarifications about the rules a reasonable "good" (non-malicious) implementation should follow.

The key concept of part 1.9 of the Standard addressing implementation freedom is the so-called as-if rule:

an implementation is free to disregard any requirement of this Standard as long as the result is as if the requirement had been obeyed, as far as can be determined from the observable behavior of the program.

The term "observable behavior", according to the standard (I cite n3092), means the following:

— Access to volatile objects are evaluated strictly according to the rules of the abstract machine.

— At program termination, all data written into files shall be identical to one of the possible results that execution of the program according to the abstract semantics would have produced.

— The input and output dynamics of interactive devices shall take place in such a fashion that prompting output is actually delivered before a program waits for input. What constitutes an interactive device is implementation-defined.

So, roughly speaking, the order and operands of volatile accesses and I/O operations should be preserved; the implementation may make arbitrary changes to the program that preserve these invariants (relative to some allowed behaviour of the abstract C++ machine).

  1. Is it reasonable to expect that a non-malicious implementation treats I/O operations broadly enough (for instance, that any system call from user code is treated as such an operation)? (E.g. a RAII mutex lock/unlock wouldn't be thrown away by the compiler even if the RAII wrapper contains no volatiles.)

  2. How deeply should "behavioral observation" descend from the user-defined C++ program level into library/system calls? The question is, of course, only about library calls that are not intended to have I/O/volatile access from the user's viewpoint (e.g. new/delete operations) but may (and usually do) access volatiles or do I/O in the library/system implementation. Should the compiler treat such calls from the user's viewpoint (and consider such side effects as not observable) or from the "library" viewpoint (and consider the side effects as observable)?

  3. If I need to prevent some code from being eliminated by the compiler, is it good practice to skip all the questions above and simply add (possibly fake) volatile accesses (wrap the needed actions in volatile methods and call them on volatile instances of my own classes) in any case that seems suspicious?

  4. Or am I totally wrong, and the compiler is disallowed to remove any C++ code except in cases explicitly mentioned by the standard (such as copy elision)?

Indiraindirect answered 12/7, 2011 at 12:55 Comment(4)
Re: #4, the compiler is definitely allowed to remove large chunks of code, for example the optimizer may replace for( int i = 0; i < 40000; i++ ) j += 2; by j += 80000;, and this is the rule that allows that. If you were trying to use a loop to introduce a delay in your program, too bad, the delay isn't observable behavior according to the as-if rule.Remembrancer
In other words, busy loopers and spin lockers beware. Busy loops and spin locks are pure evil, and this rule makes them useless.Koon
@David Hammen: most textbooks show a spin-wait when introducing volatile. Such optimizations are exactly why volatile was introduced.Nada
@presnel: Unfortunately, many compilers do not optimize volatiles properly. Just because the standards says they must do so does not mean they do do so. Gcc, for example, is not compliant. I've posted papers before. I'll see if I can dig them up again.Koon

The important bit is that the compiler must be able to prove that the code has no side effects before it can remove it (or determine which side effects it has and replace it with some equivalent piece of code). In general, because of the separate compilation model, this means that the compiler is limited in determining which library calls have observable behavior and may therefore be eliminated.

As to the depth of it, it depends on the library implementation. In gcc, the C standard library uses compiler attributes to inform the compiler of potential side effects (or the absence of them). For example, strlen is tagged with the pure attribute, which allows the compiler to transform this code:

char p[] = "Hi there\n";
for ( int i = 0; i < strlen(p); ++i ) std::cout << p[i];

into

char p[] = "Hi there\n";
int __length = strlen(p);
for ( int i = 0; i < __length; ++i ) std::cout << p[i];

But without the pure attribute the compiler cannot know whether the function has side effects or not (unless it is inlining it, and gets to see inside the function), and cannot perform the above optimization.

That is, in general, the compiler will not remove code unless it can prove that it has no side effects, i.e. that it will not affect the outcome of the program. Note that this does not relate only to volatile and I/O, since any variable change might have observable behavior at a later time.

As to question 3, the compiler will only remove your code if the program behaves exactly as if the code was present (copy elision being an exception), so you should not even care whether the compiler removes it or not. Regarding question 4, the as-if rule stands: If the outcome of the implicit refactor made by the compiler yields the same result, then it is free to perform the change. Consider:

unsigned int fact = 1;
for ( unsigned int i = 1; i < 5; ++i ) fact *= i;

The compiler can freely replace that code with:

unsigned int fact = 24; // 1*2*3*4, the value the loop computes

The loop is gone, but the behavior is the same: no individual loop iteration affects the outcome of the program, and the variable has the correct value at the end of the loop, i.e. if it is later used in some observable operation, the result will be as if the loop had been executed.

Don't worry too much about what observable behavior and the as-if rule mean; they basically mean that the compiler must yield the output that you programmed in your code, even though it is free to get to that outcome by a different path.

EDIT

@Konrad raises a really good point regarding the initial example I had with strlen: how can the compiler know that strlen calls can be elided? And the answer is that in the original example it cannot, and thus it could not elide the calls. There is nothing telling the compiler that the pointer returned from the get_string() function does not refer to memory that is being modified elsewhere. I have corrected the example to use a local array.

In the modified example, the array is local, and the compiler can verify that there are no other pointers that refer to the same memory. strlen takes a const pointer and so it promises not to modify the contained memory, and the function is pure so it promises not to modify any other state. The array is not modified inside the loop construct, and gathering all that information the compiler can determine that a single call to strlen suffices. Without the pure specifier, the compiler cannot know whether the result of strlen will differ in different invocations and has to call it.

Habile answered 12/7, 2011 at 13:19 Comment(10)
I’ve just noticed that I don’t understand how the pure attribute helps the compiler elide this call in general. What if the content of the string p is manipulated inside the loop? Can the compiler still elide strlen calls?Imprimis
@Konrad: You are right, I edited the question. In the original version the compiler cannot know for sure if some other code is accessing/modifying the pointed memory.Romanic
"In gcc, the C standard library uses ...". According to the C and C++ standards, the standard library is part of the implementation. It's subject to the same rules as the compiler. Together they may make assumptions about each other. The mentioned mechanism is one way to encode such intra-implementation assumptions. Another compiler could in fact assume the full semantics of strlen.Roice
@MSalters: I have actually just found that during testing. In g++, if the pointer is to a local array and the pointer is not handed to any function, with -O3 the compiler does not call strlen even once. Anyway, with respect to the original question the result is that at the end of the day it does not matter, as the compiler is required to maintain the behavior of the code.Romanic
@David: To summarize your answer (please correct me): 1) If a C++ program has any logical reason to expect some side effect (as in the case of an explicit system call such as entering a mutex), then a decent compiler should expect a possible side effect too. 2) If the compiler can prove the absence of a side effect, it may remove the corresponding codeIndiraindirect
@David: (...but the following is still not clear to me) 3) If a C++ programmer may suspect some side effect inside the library implementation, but this side effect obviously shouldn't be taken into account from the program-logic viewpoint — is this a border case? Imagine some sorting algorithm that needs extra memory and uses malloc() inside. It is pure from the algorithmic viewpoint but, strictly speaking, has a side effect. Should the compiler take such a side effect into account? I would suppose it is implementation dependent, but I'm interested in your (and others') opinionIndiraindirect
@user396672: The compiler must guarantee that the behavior is as-if the code was executed. In general, memory allocations and release will not be elided by the compiler, but it would not break standard conformance if it managed to do without it. A compiler in a platform where the stack is unlimited might decide to remap malloc to alloca if it can guarantee that to all effects they are equivalent. I don't know of any compiler that performs that optimization, and the fact is that it would probably be quite hard to guarantee that the behavior is not changed under any circumstance.Romanic
@user396672: From a practical point of view you should not really care: only if the result is guaranteed to be the same, the operation can be elided, which means that whether it is elided or not it would not affect your program outcome at all. I would not expect any compiler in the near future to remove memory allocations.Romanic
@David Rodríguez - dribeas : Considering the high cost of memory allocations (including thread safety), I'd disagree with your expectations. JVM's already do replace heap memory allocations with stack allocations.Roice
@DavidRodríguez-dribeas When you edited your answer way back on July 12 2011, you updated the "from" code but not the "to" code in the transformation. I think "char * p = get_string();" needs to be changed to "char * p ="Hi there\n";" Do you agree?Danndanna

The abstract machine defined by the standard will, given a specific input, produce one of a set of specific outputs. In general, all that is guaranteed is that for that specific input, the compiled code will produce one of those possible outputs. The devil is in the details, however, and there are a number of points to keep in mind.

The most important of these is probably the fact that if the program has undefined behavior, the compiler can do absolutely anything. All bets are off. Compilers can and do use potential undefined behavior for optimizing: for example, if the code contains something like *p = (*q) ++, the compiler can conclude that p and q aren't aliases to the same variable.

Unspecified behavior can have similar effects: the actual behavior may depend on the level of optimization. All that is required is that the actual output corresponds to one of the possible outputs of the abstract machine.

With regards to volatile, the standard does say that access to volatile objects is observable behavior, but it leaves the meaning of "access" up to the implementation. In practice, you can't really count on volatile much these days; actual accesses to volatile objects may appear to an outside observer in a different order than they occur in the program. (This is arguably in violation of the intent of the standard, at the very least. It is, however, the actual situation with most modern compilers, running on a modern architecture.)

Most implementations treat all system calls as “IO”. With regards to mutexes, of course: as far as C++03 is concerned, as soon as you start a second thread, you've got undefined behavior (from the C++ point of view—Posix or Windows do define it), and in C++11, synchronization primitives are part of the language, and constrain the set of possible outputs. (The compiler can, of course, eliminate the synchronizations if it can prove that they weren't necessary.)

The new and delete operators are special cases. They can be replaced by user defined versions, and those user defined versions may clearly have observable behavior. The compiler can only remove them if it has some means of knowing either that they haven't been replaced, or that the replacements have no observable behavior. In most systems, replacement is defined at link time, after the compiler has finished its work, so no changes are allowed.

With regards to your third question: I think you're looking at it from the wrong angle. Compilers don't “eliminate” code, and no particular statement in a program is bound to a particular block of code. Your program (the complete program) defines a particular semantics, and the compiler must do something which produces an executable program having those semantics. The most obvious solution for the compiler writer is to take each statement separately, and generate code for it, but that's the compiler writer's point of view, not yours. You put source code in, and get an executable out; but lots of statements don't result in any code, and even for those that do, there isn't necessarily a one to one relationship. In this sense, the idea of “preventing some code elimination” doesn't make sense: your program has a semantics, specified by the standard, and all you can ask for (and all that you should be interested in) is that the final executable have those semantics. (Your fourth point is similar: the compiler doesn't “remove” any code.)

Cretic answered 12/7, 2011 at 14:17 Comment(3)
I agree with your reasoning about question 3, given that the answer to question 1 is true. Non-overloaded new/delete are more interesting. They apparently aren't pure and aren't referentially transparent, as they should return different values for subsequent identical calls. Therefore, they have side effects. But from the program-logic viewpoint, only the return value is interesting.Indiraindirect
The issues concerning new and delete are complex. The standard versions do not have "observable behavior", so the number and order of calls to them can be changed, as long as there is no other change to observable behavior. But the user can replace them with versions which do have observable behavior. (The issue is observable behavior, not side effects.)Cretic
Your thesis about standard and overloaded new has an interesting consequence: the presence of OB depends on our scope rather than on implementation details. OB for new is present in both cases (standard and overloaded) if we consider the whole system "program+library" (since standard new obviously sometimes makes system calls). But it's not OB for our program as an isolated unit, since the program should not expect any OB from the new call according to some implicit contract. It seems an explicit interface contract would be better (like gcc's pure, but with weaker semantics)Indiraindirect

I can't speak for what the compilers should do, but here's what some compilers actually do:

#include <array>
int main()
{
    std::array<int, 5> a;
    for(size_t p = 0; p<5; ++p)
        a[p] = 2*p;
}

assembly output with gcc 4.5.2:

main:
     xorl    %eax, %eax
     ret

replacing array with vector shows that new/delete are not subject to elimination:

#include <vector>
int main()
{
    std::vector<int> a(5);
    for(size_t p = 0; p<5; ++p)
        a[p] = 2*p;
}

assembly output with gcc 4.5.2:

main:
    subq    $8, %rsp
    movl    $20, %edi
    call    _Znwm          # operator new(unsigned long)
    movl    $0, (%rax)
    movl    $2, 4(%rax)
    movq    %rax, %rdi
    movl    $4, 8(%rax)
    movl    $6, 12(%rax)
    movl    $8, 16(%rax)
    call    _ZdlPv         # operator delete(void*)
    xorl    %eax, %eax
    addq    $8, %rsp
    ret

My best guess is that if the implementation of a function call is not available to the compiler, it has to treat it as possibly having observable side-effects.

Camellia answered 12/7, 2011 at 13:18 Comment(1)
Operators new and delete are special cases. The compiler can only eliminate calls to them if it can somehow prove that the versions used are those in the standard library; that the operators have not been replaced by user defined versions.Cretic

1. Is it reasonable to expect that a non-malicious implementation treats I/O operations broadly enough

Yes. Assuming side effects is the default. Beyond that default, compilers must prove things (except for copy elision).

2. How deeply should "behavioral observation" descend from the user-defined C++ program level into library/system calls?

As deep as it can. Under the current standard, the compiler can't look behind a library boundary (library in the sense of a static library, i.e. calls that target a function inside some ".a" or ".lib" file), so side effects are assumed.

Using the traditional compilation model with multiple object files, the compiler is even unable to look behind extern calls. Optimizations across compilation units may be done at link time, though.

Btw, some compilers have an extension to tell them about pure functions. From the gcc documentation:

Many functions have no effects except the return value and their return value depends only on the parameters and/or global variables. Such a function can be subject to common subexpression elimination and loop optimization just as an arithmetic operator would be. These functions should be declared with the attribute pure. For example,

      int square (int) __attribute__ ((pure));

says that the hypothetical function square is safe to call fewer times than the program says. Some of common examples of pure functions are strlen or memcmp. Interesting non-pure functions are functions with infinite loops or those depending on volatile memory or other system resource, that may change between two consecutive calls (such as feof in a multithreading environment).

Thinking about this poses an interesting question to me: if some chunk of code mutates a non-local variable and then calls an un-introspectible function, will the compiler assume that this extern function might depend on that non-local variable?

compilation-unit A:

int foo() {
    extern int x;
    return x;
}

compilation-unit B:

#include <iostream>

int x;
void bar() {
    for (x=0; x<10; ++x) {
        std::cout << foo() << '\n';
    }
}

The current standard has a notion of sequence points. I guess that if a compiler does not see enough, it can only optimize as far as it does not break the ordering of dependent sequence points.

3. If I need to prevent some code from elimination by compiler

Except by looking at the object-dump, how could you judge whether something was removed?

And if you can't judge, then is this not equivalent to the impossibility of writing code that depends on its (non-)removal?

In that respect, compiler extensions (like for example OpenMP) help you in being able to judge. Some builtin mechanisms exist, too, like volatile variables.

Does a tree exist if nobody can observe it? And presto, we are at quantum mechanics.

4. Or am I totally wrong, and the compiler is disallowed to remove any C++ code except in cases explicitly mentioned by the standard (such as copy elision)?

No, it is perfectly allowed to. It is also allowed to transform the code as if it were a piece of slime (and, with the exception of copy elision, you couldn't tell anyway).

Nada answered 12/7, 2011 at 13:50 Comment(2)
3."As deep as it can". Hmm.. I'm almost sure that any function call may lead to page swapping, processor heating, battery consuming and system shutdown as possible side effects :) Simple acessing an element of large (statically allocated) array may cause the system to run out of phisical memory. We need to stop somewere in our observations of side effects... (I accept your 1 and 3 as the answer on question 1 is true )Indiraindirect
I should have added "within the rules of standard C++". E.g., a read of an allocated variable is guaranteed to never fail, even though you present interesting observations of possible real world side effects. +1Nada

One difference is that Java is designed to run on one platform only, the JVM. That makes it much easier to be "rigorous enough" in the specification, as there is only one platform to consider and you can document exactly how it works.

C++ is designed to run on a wide selection of platforms, and to do that natively, without an intervening abstraction layer, using the underlying hardware functionality directly. Therefore it has chosen to allow for the functionality that actually exists on different platforms. For example, the result of some shift operations like int(1) << 33 is allowed to differ between systems, because that's the way the hardware works.

The C++ standard describes the result you can expect from your program, not the way it has to be achieved. In some cases it says that you have to check your particular implementation, because the results may differ but still be what is expected there.

For example, on an IBM mainframe nobody expects floating point to be IEEE compatible, because the mainframe series is much older than the IEEE standard. Still, C++ allows the use of the underlying hardware while Java does not. Is that an advantage or a disadvantage for either language? It depends!


Within the restrictions and allowances of the language, a reasonable implementation must behave as if it did exactly what you coded in your program. If you make system calls like locking a mutex, the compiler either does not know what the calls do and therefore cannot remove them, or knows exactly what they do and therefore also knows whether they can be removed or not. The result is the same!

If you make calls to the standard library, the compiler can very well know exactly what the call does, as this is described in the standard. It then has the option of really calling the function, replacing it with some other code, or skipping it entirely if it has no effect. For example, std::strlen("Hello world!") can be replaced by 12. Some compilers do that, and you will not notice.

Souse answered 12/7, 2011 at 14:10 Comment(0)