With current C++ compilers, std::atomic works with types that are larger than what your CPU can handle atomically in hardware. On x64 the hardware supports atomics of up to 16 bytes, but std::atomic also works with larger structs. Look at this code:
#include <iostream>
#include <atomic>
using namespace std;
struct S { size_t a, b, c; };
atomic<S> apss;
int main()
{
    auto ref = apss.load( memory_order_relaxed );
    apss.compare_exchange_weak( ref, { 123, 456, 789 } );
    cout << sizeof ::apss << endl;
}
The cout above always prints 32 on my platform. But how do these transactions actually work without a mutex? I don't get any clue from inspecting the disassembly.
If I run the following code with MSVC++:
#include <atomic>
#include <thread>
#include <array>
using namespace std;
struct S { size_t a, b, c, d, e; };
atomic<S> apss;
int main()
{
    array<jthread, 2> threads;
    auto threadFn = []()
    {
        auto ref = apss.load( memory_order_relaxed );
        for( size_t i = 10'000'000; i--; apss.compare_exchange_weak( ref, { } ) );
    };
    threads[0] = jthread( threadFn );
    threads[1] = jthread( threadFn );
}
The code consumes almost no kernel time, so the contention appears to be handled completely in user space. I guess there's some kind of software transactional memory at work here.
atomic for "simpler" (e.g., atomic<SmallObject>) use-cases. And I have no idea why was that the case – Shorthand
std::atomic<T> does not imply that T is atomic on the hardware level. The point of std::atomic<T> is that you need not know if T is atomic on the hardware level. Actually, even if bool is atomic for the hardware it is not for C++, but you need to use std::atomic<bool> – Crock
cout << sizeof ::apss << endl; – Coercion
cout << (apss.is_lock_free() ? "LOCKFREE" : "MUTEX") << "\n"; ? – Goodoh
lock doesn't imply a mutex, in some implementations it's implemented with a spin lock – Pivotal
sizeof. A hidden member needs to occupy some storage. libstdc++ and libc++ seem to use another solution (hash table of locks indexed by the pointer to an atomic object), as written in the post I linked above. – Hannon
atomic<pair<uintptr_t, uintptr_t>>. Interestingly the OP there also initially guessed that transactional memory was involved. – Roentgenogram
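To make the "hidden member" and "spin lock" remarks in the comments above concrete, here is a minimal sketch of how an oversized atomic could be built from a user-space spin lock stored next to the value. The names SpinLockedAtomic, locked_, value_, demo and s_is_always_lock_free are made up for illustration; this is an assumption about the general technique, not the actual MSVC, libstdc++ or libc++ code.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstring>

struct S { std::size_t a, b, c; };   // same three-member layout as in the question

// Hypothetical illustration only: NOT the real standard library implementation.
template <typename T>
class SpinLockedAtomic {
    mutable std::atomic<bool> locked_{false};  // the "hidden member" that grows sizeof
    T value_{};

    void acquire() const {
        // Spin in user space until this thread flips the flag from false to true.
        while (locked_.exchange(true, std::memory_order_acquire)) { }
    }
    void release() const { locked_.store(false, std::memory_order_release); }

public:
    T load() const {
        acquire();
        T copy = value_;
        release();
        return copy;
    }

    // Bitwise compare-and-swap under the lock; on failure, expected receives
    // the currently stored value.
    bool compare_exchange(T& expected, const T& desired) {
        acquire();
        const bool equal = std::memcmp(&value_, &expected, sizeof(T)) == 0;
        if (equal) value_ = desired;
        else expected = value_;
        release();
        return equal;
    }
};

// Compile-time hint (C++17): can std::atomic<S> always avoid a lock on this target?
constexpr bool s_is_always_lock_free = std::atomic<S>::is_always_lock_free;

// Single-threaded smoke test of the sketch.
inline bool demo() {
    // The lock flag needs storage, so the wrapper is larger than the payload.
    assert(sizeof(SpinLockedAtomic<S>) > sizeof(S));
    SpinLockedAtomic<S> a;
    S expected = a.load();   // value-initialized, so all members are zero
    return a.compare_exchange(expected, S{123, 456, 789}) && a.load().c == 789;
}
```

A lock like this never enters the kernel while it spins, which would be consistent with the low kernel time observed in the question, and the extra flag member would be one possible explanation for sizeof being larger than the struct itself. Whether a given implementation actually works this way is something to verify against its source.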