std::atomic<bool> lock-free inconsistency on ARM (raspberry pi 3)

I had a problem with a static assert. The static assert was exactly like this:

static_assert(std::atomic<bool>::is_always_lock_free);

and the code failed on Raspberry Pi 3 (Linux raspberrypi 4.19.118-v7+ #1311 SMP Mon Apr 27 14:21:24 BST 2020 armv7l GNU/Linux).

On the cppreference.com atomic::is_always_lock_free reference site it is stated that:

Equals true if this atomic type is always lock-free and false if it is never or sometimes lock-free. The value of this constant is consistent with both the macro ATOMIC_xxx_LOCK_FREE, where defined, with the member function is_lock_free and non-member function std::atomic_is_lock_free.

The first thing that strikes me as strange is "sometimes lock-free": what does it depend on? I'll come back to that question; first, the problem.

I made a little test. Wrote this code:

#include <iostream>
#include <atomic>

int main()
{
    std::atomic<bool> dummy {};
    std::cout << std::boolalpha
            << "ATOMIC_BOOL_LOCK_FREE --> " << ATOMIC_BOOL_LOCK_FREE << std::endl
            << "dummy.is_lock_free() --> " << dummy.is_lock_free() << std::endl
            << "std::atomic_is_lock_free(&dummy) --> " << std::atomic_is_lock_free(&dummy) << std::endl
            << "std::atomic<bool>::is_always_lock_free --> " << std::atomic<bool>::is_always_lock_free << std::endl;
    return 0;
}

I compiled and ran it on the Raspberry Pi using g++ -std=c++17 atomic_test.cpp && ./a.out (g++ 7.3.0 and 8.3.0, but that shouldn't matter) and got:

ATOMIC_BOOL_LOCK_FREE --> 1
dummy.is_lock_free() --> true
std::atomic_is_lock_free(&dummy) --> true
std::atomic<bool>::is_always_lock_free --> false

As you can see it is not as consistent as stated on the cppreference site... For comparison I ran it on my laptop (Ubuntu 18.04.5) with g++ 7.5.0 and got:

ATOMIC_BOOL_LOCK_FREE --> 2
dummy.is_lock_free() --> true
std::atomic_is_lock_free(&dummy) --> true
std::atomic<bool>::is_always_lock_free --> true

So there is a difference in ATOMIC_BOOL_LOCK_FREE's value and of course the is_always_lock_free constant. Looking for the definition of ATOMIC_BOOL_LOCK_FREE all I could find is

c++/8/bits/atomic_lockfree_defines.h: #define ATOMIC_BOOL_LOCK_FREE  __GCC_ATOMIC_BOOL_LOCK_FREE
c++/8/atomic: static constexpr bool is_always_lock_free = ATOMIC_BOOL_LOCK_FREE == 2;

What is the difference between ATOMIC_BOOL_LOCK_FREE (or __GCC_ATOMIC_BOOL_LOCK_FREE) being equal to 1 or 2? Is it a case where if 1 then it may or may not be lock-free and if 2 it is 100% lock-free? Are there any other values apart from 0? Is this an error on the cppreference site where it is stated that all those return values should be consistent? Which of the results for the raspberry pi output is really true?

Conium answered 7/10, 2020 at 10:23 Comment(0)

1 means "sometimes lock free" in the standard. But really that means "not known to be lock free at compile time".

With no -march or -mcpu option, GCC targets its default baseline, which includes ARM chips so old that they lack the instructions needed for atomic RMWs. The generated code has to run on those ancient CPUs too, so GCC always calls libatomic functions instead of inlining atomic operations.

The runtime query function returns true when you run it on an RPi with its ARMv7 or ARMv8 CPU.

With -march=native or -mcpu=cortex-a53 you'd get is_always_lock_free being true, because it's known at compile time that the target machine definitely supports the required instructions. (Those options tell GCC to make a binary that might not run on other / older CPUs.) This was confirmed by the OP in comments.

Without that compile option, std::atomic operations have to call libatomic functions, so there's extra overhead even on a modern CPU.

The way GCC (and all sane compilers) implement std::atomic<T>, it's either lock free for all instances or none, not checking alignment or whatever at runtime per object.

alignof( std::atomic<int64_t> ) is 8 even if alignof( int64_t ) is only 4 on a 32-bit machine, so it's undefined behaviour if you have a misaligned atomic object. (The practical symptoms of that UB could include tearing, i.e. non-atomicity, for pure-load and pure-store.) If you follow C++ rules, all your atomic objects will be aligned; you'd only have a problem if you cast a misaligned pointer to atomic<int64_t> * and tried to use it.

Confucian answered 7/10, 2020 at 23:52 Comment(2)
What about when compiling for RISC-V?Bio
@AaronFranke: IDK if LL/SC instructions are baseline for RISC-V. If so, then is_always_lock_free would be true even without any arch options. You can always try it on godbolt.org with RISC-V GCC or clang.Confucian

The ATOMIC_xxx_LOCK_FREE macros mean:

  • 0 for the built-in atomic types that are never lock-free
  • 1 for the built-in atomic types that are sometimes lock-free
  • 2 for the built-in atomic types that are always lock-free.

So, in your PI environment, a std::atomic<bool> is sometimes lock-free and the dummy instance you are testing is lock-free - which means that all instances are.

bool std::atomic_is_lock_free( const std::atomic<T>* obj ):

In any given program execution, the result of the lock-free query is the same for all pointers of the same type.

The only downside is that you don't know if the type is lock-free until you run the program.

if (not std::atomic_is_lock_free(&dummy)) {
    std::cout << "Sorry, the program will be slower than expected\n";
}
Benita answered 7/10, 2020 at 10:39 Comment(3)
Most likely, "sometimes lock free" means "not known to be lock free at compile time", but the runtime query function returns true once you run it on an RPi with its ARMv7 CPU. Probably with -march=native or -mcpu=cortex-whatever you'd get is_always_lock_free being true. (Without that compile option, std::atomic operations have to call libatomic functions, so there's extra overhead even on a modern CPU). The way GCC (and all sane compilers) implement std::atomic<T>, it's either lock free for all instances or none, not checking alignment or whatever at runtime per instance.Confucian
PeterCordes: That seems reasonable. @Conium Can you try the -m options to see if you get it to be always lock-free?Benita
I added -mcpu=cortex-a53 to compilation and it worked! ATOMIC_BOOL_LOCK_FREE is 2 and std::atomic<bool>::is_always_lock_free evaluates to true! Thanks so much for your help!Conium
