How to ensure some code is optimized away?
tl;dr: Can it be ensured somehow (e.g. by writing a unit test) that some things are optimized away, e.g. whole loops?

The usual approach to be sure that something is not included in the production build is to wrap it in #if...#endif. But I prefer to stick with C++ mechanisms instead. Even there, instead of complicated template specializations, I like to keep implementations simple and argue, "hey, the compiler will optimize this out anyway".

The context is embedded automotive software (binary size matters) with often poor compilers. They are certified in the sense of safety, but usually not good at optimization.

Example 1: In a container the destruction of elements is typically a loop:

for(size_t i = 0; i<elements; i++)
    buffer[i].~T();

This also works for built-in types such as int, since the standard allows an explicit destructor call for scalar types as well (C++11 12.4-15). In that case the loop does nothing and should be optimized out. In GCC it is, but with another compiler (Aurix) it was not: I saw a literally empty loop in the disassembly! That needed a template specialization to fix.
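
For reference, the specialization fix can be sketched in C++11 with trait-based dispatch (`destroy_all` and the tag overloads are illustrative names, not the actual container code):

```cpp
#include <cstddef>
#include <type_traits>

// Dispatch on whether T has a trivial destructor, so that for scalar
// types no destruction loop exists in the first place -- we don't have
// to trust the optimizer to remove it.
template <typename T>
void destroy_all(T* buffer, std::size_t count, std::false_type) {
    for (std::size_t i = 0; i < count; ++i)
        buffer[i].~T();            // non-trivial destructors: real loop
}

template <typename T>
void destroy_all(T*, std::size_t, std::true_type) {
    // trivially destructible: empty overload, no loop is ever emitted
}

template <typename T>
void destroy_all(T* buffer, std::size_t count) {
    destroy_all(buffer, count,
                typename std::is_trivially_destructible<T>::type());
}
```

(Note that std::is_trivially_destructible needs C++11 library support; some older toolchains only ship non-standard predecessors, so a toolchain-specific alias may be required.)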

Example 2: Code, which is intended for debugging, profiling or fault-injection etc. only:

constexpr bool isDebugging = false; // somehow a global flag
void foo(int arg) {
    if( isDebugging ) {
        // Although 'dead', this section must not appear in the production binary!
        // (size, security, safety...)
        // 'if constexpr' is not an option (C++11)
        std::cout << "Arg was " << arg << std::endl;
    }
    // normal code here...
}

I can look at the disassembly, sure. But as this is upstream platform software, it's hard to control all the targets, compilers and options downstream projects might use. The fear is that, for whatever reason, a downstream project ends up with code bloat or a performance issue.

Bottom line: Is it possible to write the software in a way that certain code is known to be optimized away as safely as a #if would guarantee? Or to write unit tests that fail if the optimization is not as expected?

[Timing tests come to mind for the first problem, but on bare metal I don't have convenient tools yet.]

Untread answered 7/8, 2019 at 15:40 Comment(9)
Replace if with if constexpr and the compiler will definitely remove your codeAntre
The typical way to guard against this is to use compile time dispatch. Instead of having an unconditional loop, you test T for its characteristics. If it is a built in type, then you never even call the loop so you know it isn't taken.Maryammaryann
@Maryammaryann For the loop, that's the current way, but it falls under my "I would do it simpler and get away with it", hence the question.Untread
Are you saying that, in the automotive industry, you use compilers that are "exceptionally buggy" yet regardless "certified"? Isn't this incredibly bad news? Do you have a plan to resolve this problem and potentially save lives?Pistil
Performance is measured with a profiler. You can always measure performance in a unit test if you want to ensure that a particular function doesn't take too much time to execute.Trituration
@LightnessRacesinOrbit Unfortunately everything is very commercial. GCC or Clang are superior in a lot of ways, but don't have the qualification for safety-relevant functionality. Some missed optimizations are probably not a priority.Untread
@Untread Missed optimisations are one thing; you suggested there were many outright bugs, though!Pistil
@LightnessRacesinOrbit Well, mainly they are bad in terms of standard-compliance and optimization (just got shocked today). But there are (few) outright bugs sometimes, too! They get reported back and such, but hey, I sometimes get the impression that expensive certified compilers are just taken to be able to sue someone if something bad happens in the field :-(.Untread
@Untread Lol nicePistil

if constexpr is the canonical C++ expression (since C++17) for this kind of test.

constexpr bool DEBUG = /*...*/;

int main() {
    if constexpr(DEBUG) {
        std::cerr << "We are in debugging mode!" << std::endl;
    }
}

If DEBUG is false, then the code that prints to the console won't be generated at all. So if you have things like log statements that you need for checking the behavior of your code, but which you don't want in production code, you can hide them inside if constexpr blocks to eliminate the code entirely once it is moved to production.
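
Since the OP is restricted to C++11, a similar effect can be approximated with overload dispatch on a compile-time tag (this is a sketch; debug_tag and log_arg are made-up names). Unlike if constexpr, both overloads must compile for every instantiation, but the disabled one is an empty function that any compiler trivially removes:

```cpp
#include <iostream>

constexpr bool DEBUG = false;  // somehow a global flag

template <bool B> struct debug_tag {};

// Disabled case: empty body, nothing left to optimize away.
inline void log_arg(int, debug_tag<false>) {}

// Enabled case: only this overload contains the logging code.
inline void log_arg(int arg, debug_tag<true>) {
    std::cerr << "Arg was " << arg << std::endl;
}

void foo(int arg) {
    log_arg(arg, debug_tag<DEBUG>());
    // normal code here...
}
```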

Pulsatory answered 7/8, 2019 at 16:8 Comment(4)
But does this really answer the question, if "the compilers might be certified but are exceptionally buggy" is one of the reasons/concerns why the OP wants to do unit testing? Even though the compiler has to optimize the constexpr away, it might be buggy in the given case and that portion of the code might find its way into the compilation.Parfait
@Parfait If the compiler's behavior cannot be relied upon, then there is no possible answer that would satisfy the OP's requirements. I'm operating under the understanding that "the compilers are buggy" is colloquial synecdoche for "I'm worried that what we're doing is non-standard behavior, and thus cannot be relied upon in other environments". If the compilers the OP is using are truly buggy or unreliable, then the only possible advice we could give them is "stop using a buggy compiler".Pulsatory
@Pulsatory That's a nice solution and helpful, but C++17 only and unfortunately we have only C++11. :-(Untread
@Parfait True, you are really safe only when you test it. But it's a nice feature enforcing the compiler's decision similar to template specialization tricks.Untread

There may be a more elegant way, and it's not a unit test, but if you're just looking for that particular string, and you can make it unique,

strings $COMPILED_BINARY | grep "Arg was"

should show you whether the string is included.

Byrnie answered 7/8, 2019 at 15:49 Comment(2)
That's not as generic as expected, but a viable idea! You can have any dummy string literal passed into a function or so. If it's optimized out, you know the whole block was optimized out.Untread
Good approach, and better than the constexpr approach: unless I am mistaken, even with constexpr you cannot be sure the code is not compiled in. The compiler could still emit a redundant conditional check that loads a constant 0 from memory (which could be the compile-time result of the constexpr) and tests it for a non-zero value.Previous

Looking at your question, I see several (sub-)questions in it that require an answer. Not all answers might be possible with your bare-metal compilers, as hardware vendors don't care that much about C++.

The first question is: how do I write code in a way that I'm sure it gets optimized? The obvious answer here is to put everything in a single compilation unit so the caller can see the implementation.

The second question is: how can I force the compiler to optimize? Here constexpr is a blessing. Depending on whether you have support for C++11, C++14, C++17 or even the upcoming C++20, you get different feature sets of what you can do in a constexpr function. For usage:

constexpr char c = std::string_view{"my_very_long_string"}[7];

With the code above, c is defined as a constexpr variable. Because you apply constexpr to the variable, you require several things:

  • Your compiler has to evaluate the code so that the value of c is known at compile time. This even holds for -O0 builds!
  • All functions used to calculate c are constexpr and their definitions are available (which, as a result, enforces the behaviour from the first question).
  • No undefined behaviour may be triggered in the calculation of c (for the given value).

The downside of this is: your input needs to be known at compile time.
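
One way to turn that compile-time requirement into a hard check is a static_assert on a constexpr result: if the compiler cannot evaluate it at compile time, the build fails rather than silently falling back to runtime code. A minimal C++11 sketch (square is just an illustrative function):

```cpp
// C++11: a single-return constexpr function the compiler must fold.
constexpr int square(int x) { return x * x; }

// Fails to compile unless square(7) is evaluated at compile time,
// independent of the optimization level.
static_assert(square(7) == 49, "square() must be computable at compile time");
```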

C++17 also provides if constexpr, which has similar requirements: the condition needs to be evaluated at compile time. The result is that the discarded branch is not compiled at all (it may even contain constructs that don't work for the type you are using).

Which then brings us to the question: how do I ensure sufficient optimization for my program to run fast enough, even if my compiler isn't well behaved? Here the only relevant answer is: create benchmarks and compare the results. Take the effort to set up a CI job that automates this for you. (And yes, you can even use external hardware, although that isn't easy.) In the end you have requirements like: handling A should take less than X seconds. So do A several times and time it. Even if such tests don't cover everything, as long as the timings stay within the requirements, it's fine.
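
The benchmark idea can be sketched with a tiny C++11 timing helper (time_us is an illustrative name; a real CI job would compare the result against a stored budget and fail the pipeline when it is exceeded):

```cpp
#include <chrono>

// Run a callable `reps` times and return the total wall-clock time in
// microseconds, using the monotonic steady_clock so that system clock
// adjustments don't skew the measurement.
template <typename F>
long long time_us(F f, int reps) {
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < reps; ++i)
        f();
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start)
        .count();
}
```

A CI test would then assert something like `time_us(handleA, 100) < budget_us` (handleA and budget_us being project-specific).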

Note: as this is about debugging, you can most likely track the size of the executable as well. As soon as you start using streams, lots of conversions to string and so on, your executable size will grow. (And you'll find this a blessing, as you will immediately spot commits which add 10% to the image size.)

And then the final question: you have a buggy compiler that doesn't meet your requirements. Here the only answer is: replace it. In the end, you can use any compiler to compile your code for bare metal, as long as the linker scripts work. If you need a starting point, C++Now 2018: Michael Caisse "Modern C++ in Embedded Systems" gives you a very good idea of what is needed to switch to a different compiler (like a recent Clang or GCC, against which you can even file bugs if the optimization isn't good enough).

Advocate answered 7/8, 2019 at 17:21 Comment(0)

Insert a reference to external data or function into the block that should be verified to be optimised away. Like this:

extern void nop();
constexpr bool isDebugging = false; // somehow a global flag
void foo(int arg) {
    if( isDebugging ) {
        nop();
        std::cout << "Arg was " << arg << std::endl; // may not appear in production binary!
    }
    // normal code here...
}

In debug builds, link against an implementation of nop() in an extra compilation unit nop.cpp:

void nop() {}

In release builds, don't provide an implementation. Release builds will then only link if the optimisable code has been eliminated.

- kisch

Loy answered 26/9, 2020 at 22:12 Comment(3)
But the compiler doesn't see the 'nop' implementation, so it cannot optimize it out! It will always fail at link time.Jez
Shame on me. You're right of course. It would work for the OP's second example, though. I put the solution into the first example without thinking. Please allow me to fix my badness.Loy
That is a great idea! When this works, it's my favourite solution, as it's portable and generic.Untread

Here's another nice solution using inline assembly. It uses assembler directives only, so it might even be somewhat portable (checked with clang).

constexpr bool isDebugging = false; // somehow a global flag
void foo(int arg) {
    if( isDebugging ) {
        asm(".globl _marker\n_marker:\n");
        std::cout << "Arg was " << arg << std::endl; // may not appear in production binary!
    }
    // normal code here...
}

This would leave an exported linker symbol in the compiled executable, if the code isn't optimised away. You can check for this symbol using nm(1).

clang can even stop the compilation right away:

constexpr bool isDebugging = false; // somehow a global flag
void foo(int arg) {
    if( isDebugging ) {
        asm("_marker=1\n");
        std::cout << "Arg was " << arg << std::endl; // may not appear in production binary!
    }
    asm volatile (
        ".ifdef _marker\n"
        ".err \"code not optimised away\"\n"
        ".endif\n"
    );
    // normal code here...
}
Loy answered 27/9, 2020 at 1:18 Comment(0)

This is not an answer to "How to ensure some code is optimized away?" but to your summary line, which asks whether a unit test can verify that e.g. whole loops are optimized away.

First, the answer depends on how far you see the scope of unit testing: if you include performance tests, you might have a chance.

If, in contrast, you understand unit testing as a way to test the functional behaviour of the code, then you can't. For one thing, optimizations (if the compiler works correctly) shall not change the behaviour of standard-conforming code.

With incorrect code (code that has undefined behaviour), optimizers can do whatever they want. (For code with undefined behaviour the compiler may do so in the non-optimizing case as well, but sometimes only the deeper analyses performed during optimization make it possible for the compiler to detect the undefined behaviour.) Thus, if you write unit tests for some piece of code with undefined behaviour, the test results may differ when you run the tests with and without optimization. But, strictly speaking, this only tells you that the compiler translated the code in two different ways; it does not guarantee that the code is optimized in the way you want it to be.

Previous answered 8/8, 2019 at 22:44 Comment(0)

Here's another different way that also covers the first example. You can verify (at runtime) that the code has been eliminated, by comparing two labels placed around it.

This relies on the GCC extension "Labels as Values" https://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html

before:
    for(size_t i = 0; i<elements; i++)
        buffer[i].~T();
behind:
    if (intptr_t(&&behind) != intptr_t(&&before)) abort();

It would be nice if you could check this in a static_assert(), but sadly the difference of &&label expressions is not accepted as compile-time constant.

GCC insists on inserting a runtime comparison, even though both labels are in fact at the same address.

Interestingly, if you compare the addresses (type void*) directly, without casting them to intptr_t, GCC falsely optimises away the if() as "always true", whereas clang correctly optimises away the complete if() as "always false", even at -O1.

Loy answered 26/9, 2020 at 23:29 Comment(1)
That's a good hint for example 1, but unfortunately very specific for one compiler. To have a good feeling about optimization, a check for all compilers we support would be needed.Untread

© 2022 - 2024 — McMap. All rights reserved.