Why is the compiler allowed to optimize out this busy waiting loop?
#include <atomic>
#include <chrono>
#include <cstdio>
#include <thread>

int main()
{
    std::atomic<bool> ready{false};

    std::thread threadB = std::thread([&]() {
        while (!ready) {}

        printf("Hello from B\n");
    });

    std::this_thread::sleep_for(std::chrono::seconds(1));

    printf("Hello from A\n");

    ready = true;

    threadB.join();

    printf("Hello again from A\n");
}

This is an example from the CppCon talk https://www.youtube.com/watch?v=F6Ipn7gCOsY&ab_channel=CppCon (around the 17-minute mark).

The objective is to print "Hello from A" first and only then let threadB proceed. Busy waiting should clearly be avoided because it burns CPU time while doing no useful work.
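For comparison, here is a minimal sketch of the same ordering without spinning, using std::condition_variable. This is not the code from the talk: the helper name run_blocking and the log vector are hypothetical, and the events are recorded rather than printed so the order can be checked.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical helper: records the order of events instead of printing them.
std::vector<std::string> run_blocking() {
    std::mutex m;
    std::condition_variable cv;
    bool ready = false;              // plain bool, always accessed under m
    std::vector<std::string> log;    // written under m, or after join()

    std::thread b([&] {
        std::unique_lock<std::mutex> lock(m);
        cv.wait(lock, [&] { return ready; });  // blocks instead of spinning
        log.push_back("B");
    });

    {
        std::lock_guard<std::mutex> lock(m);
        log.push_back("A");
        ready = true;                // set only after "A" has been recorded
    }
    cv.notify_one();
    b.join();

    log.push_back("A again");
    assert(log.size() == 3);
    return log;
}
```

Calling run_blocking() from main should yield the entries "A", "B", "A again" in that order, because thread B cannot get past cv.wait until ready has been set under the mutex.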

The author said that the compiler can optimize the while (!ready) {} loop by keeping the value of ready in a register, because it sees that threadB never sleeps and concludes that ready can never change. But even if the thread never sleeps, another thread could still change the value, right? There is no data race because ready is atomic. The author also states that this code is UB. Can somebody explain why the compiler would be allowed to perform such an optimization?
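To make the transformation under discussion concrete, here is a small single-threaded sketch of "re-read the flag every iteration" versus "read it once and cache it". The names poll_reload, poll_hoisted, and tick are hypothetical. The tick callback stands in for a store made by another thread, and the loops are bounded so that both variants terminate.

```cpp
#include <atomic>
#include <cassert>
#include <functional>

// Performs a fresh atomic load on every iteration, like while (!ready) {}.
bool poll_reload(const std::atomic<bool>& ready, int max_spins,
                 const std::function<void(int)>& tick) {
    assert(max_spins >= 0);
    for (int i = 0; i < max_spins; ++i) {
        if (ready.load()) return true;  // observes a store made during tick()
        tick(i);                        // stand-in for "another thread may store here"
    }
    return false;
}

// Hypothetical "hoisted" form: the flag is read once and cached in a local.
bool poll_hoisted(const std::atomic<bool>& ready, int max_spins,
                  const std::function<void(int)>& tick) {
    assert(max_spins >= 0);
    const bool snapshot = ready.load();  // the only load
    for (int i = 0; i < max_spins; ++i) {
        if (snapshot) return true;       // never sees later stores
        tick(i);
    }
    return false;
}
```

If tick stores true on some iteration, poll_reload returns true on the next one, while poll_hoisted keeps returning false. The two forms differ in observable behavior, which is the crux of the question.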

Alleman answered 28/8, 2021 at 20:25 Comment(5)
I believe the speaker is incorrect, and this loop cannot be optimized away. The whole point of std::atomic is that it can in fact change spontaneously, and the compiler cannot assume otherwise. Removing the loop would very much change the observable behavior of the program, hence it is not a valid optimization. - Alleras
Sleeping only tells the scheduler that the thread does not need the CPU for a certain amount of time, and that's it (it is no different from logic that takes the same amount of time). Your loop will not get optimized out (removed) unless it's empty. - Peer
@Peer The loop is not empty: it calls ready.load(). - Alleras
I think that since there are two choices here (optimize away the loop, or don't optimize away the loop), this would be a question of whether this is Unspecified Behavior, not Undefined Behavior. - Finnigan
Best to think of atomics as preventing tearing and constraining reordering. They are not also implicitly volatile. - Neibart

The author admits in one of the comments below the video that he was wrong:

I had thought so, but it appears I was wrong; the compiler cannot hoist the atomic read out of the loop. The advice at @17:54 is still correct — you should still be very careful and beware of situations where the compiler might reorder or coalesce or eliminate atomic accesses in general — but this particular while-loop is NOT actually such a situation. For some (mostly theoretical) examples of how a compiler might optimize atomic access patterns, see JF Bastien's N4455 "No Sane Compiler Would Optimize Atomics" http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/n4455.html
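As an illustration of a spin loop in which each iteration performs a real atomic load, here is a minimal sketch with explicit acquire/release ordering. The helper name handshake and the payload variable are hypothetical and do not appear in the talk. The original code uses the default sequentially consistent operations, which are at least as strong.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper: returns the value thread B reads after the flag flips.
int handshake() {
    std::atomic<bool> ready{false};
    int payload = 0;   // plain int, published through the release/acquire pair
    int seen = -1;     // written by B, read by the caller only after join()

    std::thread b([&] {
        // Each iteration performs its own atomic load of ready.
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();  // optional hint to the scheduler
        }
        seen = payload;  // ordered after the write below by release/acquire
    });

    payload = 42;
    ready.store(true, std::memory_order_release);
    b.join();

    assert(seen == 42);
    return seen;
}
```

Calling handshake() from main should return 42: the release store to ready happens after the write to payload, and the acquire load that reads true makes that write visible to thread B.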

Heroics answered 28/8, 2021 at 20:48 Comment(0)
