Can and does the compiler optimize out two atomic loads? [duplicate]

Asked 24/1, 2017 at 5:20 Answered 26/1, 2017 at 13:46

Solved c++ gcc x86 atomic compiler-optimization

Will the two loads be combined into one in scenarios like the ones below? If this is architecture-dependent, what happens on, say, modern Intel processors? I believe atomic loads are equivalent to normal loads on Intel processors.

void run1() {
    auto a = atomic_var.load(std::memory_order_relaxed);
    auto b = atomic_var.load(std::memory_order_relaxed);
    // Some code using a and b;
}

void run2() {
    if (atomic_var.load(std::memory_order_relaxed) == 2 && /*some conditions*/ ...) {
        if (atomic_var.load(std::memory_order_relaxed) * somevar > 3) {
            /*...*/
        }
    }
}

run1() and run2() are simply two scenarios using two loads of the same atomic variable. Can the compiler collapse such scenarios of two loads into one load and reuse that?

Noreen answered 24/1, 2017 at 5:20 Comment(2)

Sorry, why do you have both run1 and run2? Can you be more specific in your question? – Anamorphosis 24/1, 2017 at 5:23

@Brian Simply two scenarios. Edited. – Noreen 24/1, 2017 at 5:33

Can the compiler optimize away atomic loads?

Your implementation of run1() can be safely optimized to

void run1() {
    auto a = atomic_var.load(std::memory_order_relaxed);
    auto b = a;
    // Some code using a and b;
}

In the original program the two loads could possibly be adjacent to each other in the total order of accesses on atomic_var every time run1() is called. In that case the adjacent load() operations would return the same result.

Since that possibility cannot be excluded, the compiler is allowed to optimize away the second load(). This can be done for any memory order argument, not just for relaxed atomics.
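To make the argument concrete, here is a minimal sketch. It assumes a global std::atomic<int> named atomic_var with an initial value of 2 (the question never shows the definition), and the function names are hypothetical. Without a concurrent writer, both forms return the same pair; with one, the single-load form simply corresponds to those executions in which no store lands between the two loads.

```cpp
#include <atomic>
#include <cassert>
#include <utility>

// Assumption: the question does not show this definition.
std::atomic<int> atomic_var{2};

// The form written in the question: two separate relaxed loads.
std::pair<int, int> run1_two_loads() {
    auto a = atomic_var.load(std::memory_order_relaxed);
    auto b = atomic_var.load(std::memory_order_relaxed);
    return {a, b};
}

// The form a compiler would be allowed to produce: one load, reused.
std::pair<int, int> run1_one_load() {
    auto a = atomic_var.load(std::memory_order_relaxed);
    auto b = a;
    return {a, b};
}

// Hypothetical check: with no concurrent writer the two forms agree.
void check_equivalence() {
    assert(run1_two_loads() == run1_one_load());
}
```

Calling check_equivalence() from main() exercises both forms. It does not show what a particular compiler emits; for that you still have to read the generated assembly.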

For run2() it depends. You didn't specify /*some conditions*/. If they contain something that might have a visible side effect on the atomic variable (such as an opaque function call or an access to a volatile variable), then the second load cannot be optimized away. Otherwise it might be possible.
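As a sketch of the case where merging is not allowed, assume /*some conditions*/ is a function that stores to atomic_var. The definitions of atomic_var and somevar, and the names condition_with_side_effect and check_run2, are assumptions made for illustration. The second load has to observe the store made by the same thread, so reusing the first value would change the result.

```cpp
#include <atomic>
#include <cassert>

// Assumptions: neither definition appears in the question.
std::atomic<int> atomic_var{2};
int somevar = 1;

// Hypothetical stand-in for /*some conditions*/ that modifies atomic_var.
bool condition_with_side_effect() {
    atomic_var.store(5, std::memory_order_relaxed);
    return true;
}

bool run2() {
    if (atomic_var.load(std::memory_order_relaxed) == 2 && condition_with_side_effect()) {
        // This load must see the 5 stored above. Reusing the earlier
        // value 2 would evaluate 2 * 1 > 3 and wrongly skip the branch.
        if (atomic_var.load(std::memory_order_relaxed) * somevar > 3) {
            return true;
        }
    }
    return false;
}

// Hypothetical driver: resets the state, then runs the scenario once.
void check_run2() {
    atomic_var.store(2, std::memory_order_relaxed);
    bool taken = run2();
    assert(taken);  // would fail if the second load reused the value 2
    (void)taken;
}
```

If the condition is something the compiler can see through and prove free of such effects, this obstacle disappears and the reasoning for run1() applies again.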

Does the compiler optimize out two atomic loads?

It depends on your compiler, possibly on the compiler options you passed, and possibly on your platform. There is an ongoing debate about whether compilers should optimize atomics. N4455, No Sane Compiler Would Optimize Atomics, and this video are good starting points on the topic.

GCC and Clang don't optimize the two load() operations into one at the moment.

Alessandraalessandria answered 26/1, 2017 at 13:46 Comment(5)

So, the answer as of now is the standard allows, but the modern compilers don't do it and it is still a subject of discussion. – Noreen 27/1, 2017 at 7:36

That's probably correct. At least in this case they don't optimize (much). Maybe they do in other cases. Surely, modern optimizing compilers perform reorderings of stores and loads around atomics as far as their memory orderings permit it. For a C++ programmer the C++ standard is the interface to code against. Relying on whether the compiler optimizes something or not is usually not a good idea, since bugs can be introduced with any new compiler update otherwise. – Alessandraalessandria 27/1, 2017 at 9:14

A recent question asked the same thing about coalescing repeated stores. My answer there is essentially the same as your answer here, including linking N4455 (but also wg21.link/p0062). Also some stuff about why compilers choose not to, until this is sorted out. – Doggerel 2/9, 2017 at 6:57

Anyway, so I closed this as a duplicate of the new one. – Doggerel 2/9, 2017 at 7:4

"There is some debate going on, on whether compilers should optimize atomics" but there is no debate that the optimization is valid in such a simple case – Stagemanage 14/12, 2018 at 0:31

Neither GCC (6.3) nor Clang (3.9) currently optimizes the two loads into one.

The only way to know is to look at the generated assembly: https://godbolt.org/g/nZ3Ekm

Ulani answered 24/1, 2017 at 6:1 Comment(4)

Is it allowed by the standard? Doesn't volatile say that you need to load every time? Why doesn't that count for atomic? – Noreen 24/1, 2017 at 7:31

@themagicalyang: std::atomic does not have to use volatile, and it should be possible to combine the two loads under memory_order_relaxed. You could file a report about this on GCC's bug tracker and see what they say. Be sure to link to it here if you do! – Ulani 24/1, 2017 at 7:46

@themagicalyang: If you want a volatile std::atomic<int> , then use that. But you left out the definition of atomic_var so we have no reason to assume it's volatile. – Hazan 24/1, 2017 at 11:22
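A minimal sketch of the volatile std::atomic<int> suggestion; the name volatile_atomic_var, its initial value, and the helper functions are assumptions made for illustration. std::atomic provides volatile-qualified overloads of load(), and accesses to a volatile object count as observable behavior, so the compiler is expected to perform both loads here rather than merge them.

```cpp
#include <atomic>
#include <cassert>

// Assumption: hypothetical name and initial value, not from the question.
volatile std::atomic<int> volatile_atomic_var{7};

int sum_of_two_loads() {
    // Both calls resolve to the volatile-qualified overload of load().
    int a = volatile_atomic_var.load(std::memory_order_relaxed);
    int b = volatile_atomic_var.load(std::memory_order_relaxed);
    return a + b;
}

// Hypothetical check: with no other writer, each load reads 7.
void check_sum() {
    assert(sum_of_two_loads() == 14);
}
```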

Lots of information on this can be found on JF Bastien's excellent post: open-std.org/jtc1/sc22/wg21/docs/papers/2015/n4455.html – Mistranslate 24/1, 2017 at 14:34
