Why isn't atomic double fully implemented?
V

2

32

My question is quite simple. Why isn't std::atomic<double> implemented completely? I know it has to do with atomic RMW (read-modify-write) access, but I really don't see why this shouldn't be possible on a double.

It's specified that any trivially copyable type can be used. And of course double is among them. So C++11 requires the basic operations (load, store, CAS, exchange, etc.) that you can use with any class type.

However, on integers an extra set of operations is possible (fetch_add, ++, +=, etc).

A double differs very little from these types. It's native, trivially copyable, etc. Why didn't the standard include double among these types?


Update: C++20 does specialize std::atomic<T> for floating-point types, with fetch_add and fetch_sub; see "C++20 std::atomic<float> - std::atomic<double> specializations". It does not provide atomic absolute-value (AND) or negate (XOR).

Editor's note: Without C++20 you can roll your own out of CAS; see "Atomic double floating point or SSE/AVX vector load/store on x86_64" for portable examples. atomic<double> and atomic<float> are lock-free on most C++ implementations.
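
As a minimal sketch of the CAS approach mentioned in the editor's note, assuming only the C++11 <atomic> header: the helper names fetch_update and fetch_add_double are hypothetical and not part of the standard library.

```cpp
#include <atomic>

// Hypothetical helper: atomically replaces the stored value with op(old)
// and returns the old value, using a compare-exchange retry loop.
template <typename Op>
double fetch_update(std::atomic<double>& target, Op op,
                    std::memory_order order = std::memory_order_seq_cst) {
    double expected = target.load(std::memory_order_relaxed);
    // If another thread changed the value, compare_exchange_weak fails and
    // writes the current value into `expected`, so op is re-applied to it.
    while (!target.compare_exchange_weak(expected, op(expected), order,
                                         std::memory_order_relaxed)) {
    }
    return expected;
}

// A fetch_add for double expressed through the helper above.
inline double fetch_add_double(std::atomic<double>& target, double delta) {
    return fetch_update(target, [delta](double v) { return v + delta; });
}
```

For example, with std::atomic<double> a(1.0), calling fetch_add_double(a, 2.5) returns the previous value and leaves the sum in a. The same helper covers operations that even C++20 does not provide, such as an atomic multiply.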

Vaules answered 5/5, 2015 at 9:2 Comment(15)
I'd guess the reason is that most CPUs don't support atomic double operations. So how would you implement it? – Favorite
Is double even trivially copyable in the strictest sense, seeing how its memory storage is not identical to its register storage on most current CPUs? But even if it is, does it make sense to perform atomic operations on floating-point data at all? You could probably say "sure, why not?", but it's hard for me to imagine why one would want to do any such thing. Integers and pointers and double pointers type-punned as large integers, sure. But floating-point data? – Lorenlorena
@Lorenlorena Well, this might sound naive, but if I have a double value inside a class and I want to write it on one thread and read it on another, an atomic seems to me the way to go. – Vaules
I do that kind of thing by submitting a "task" (which is really just a struct of a function pointer and a void pointer) to a queue. The other thread pulls tasks from the queue and invokes the function pointer. The void pointer points to "whatever data", presumably an input and output buffer, or in your case one or several double values, whichever is needed for the "task". Once done, the worker thread posts the pointer to the data onto the "results" queue, from which the main thread can pull it. – Lorenlorena
@Lorenlorena That sounds like a custom implementation of the std::future and std::promise framework, right? In my particular case I have a double value on the worker thread that changes every so often, and I want to cout it on the main thread. It's not really necessary to always have the very latest value. And std::atomic has the least overhead for this, if I'm not mistaken? – Vaules
In that particular case, std::atomic is the most desirable thing, yes. But expect it to be implemented with a mutex; I'm pretty sure it's not lock-free on any mainstream architecture (this would very much surprise me). I wonder whether it might be worth punning the double into an int64_t and back. Not precisely the nicest thing to do, but this will be lock-free on (almost) every platform, and the cost to convert between floating point and integer is probably more or less equivalent to a mutex on the fast path (spinning) and a few dozen times faster when congested (syscall). – Lorenlorena
@Lorenlorena Well, std::atomic_is_lock_free actually says atomic double is lock-free for the msvc2013 compiler :). So at least in this particular case it seems to be lock-free :). – Vaules
@Damon: The standard doesn't care about registers. "Trivially copyable" is just about copying from one memory location to another using memcpy (simply put). The architecture would have to be very strange if it didn't support that. – Vieira
@Damon: You seem to be talking about obsolete x87 with 80-bit registers. Even if you are compiling without SSE2, fld qword / fstp qword is a copy for double. Converting to the 80-bit internal format and back will never change the result, because every double can be exactly represented in that format. (Denormals are normalized when loading, but the extra exponent range always allows it to exactly represent the value.) – Ceporah
x87 doesn't have flush-to-zero or denormals-are-zero settings, and rounding mode doesn't come into play (because every double can be exactly represented). The only thing that could munge a double is if the x87 precision mode was set to 24-bit (float) mantissa. Fun fact: gcc uses fild/fistp to implement std::atomic<int64_t> load/store on 32-bit x86 (going from/to integer avoids raising FP exceptions; fld can raise exceptions if they're unmasked). – Ceporah
But anyway, SSE2 is baseline for 64-bit, and a lot of 32-bit software is built with SSE2 enabled. In that case, there's absolutely no weirdness. Either way, std::atomic<double> is lock-free on gcc/clang/msvc. stackoverflow.com/questions/45055402/… – Ceporah
Related: Atomic double floating point or SSE/AVX vector load/store on x86_64 – Ceporah
@curiousguy: "trivially copyable", not "copiable", was the correct spelling. I only mention this in case you were going to edit anything else to make the same change. The rest of your edit looks reasonable. "interlocked" was asking whether the underlying implementation is possible, using x86 Windows terminology, which is sort of OK, but sure, HW support for atomic RMW is the same thing. – Ceporah
@PeterCordes Sorry about the spelling; "copyable" didn't pass my spell checker (I've added it to my personal dictionary now!). "interlocked" has only one reference: "Interlocked Class on MSDN". – Nick
@curiousguy: see learn.microsoft.com/en-us/windows/win32/api/winnt/… and friends, like InterlockedXor64 – Ceporah
L
23

std::atomic<double> is supported in the sense that you can create one in your program and it will work under the rules of C++11. You can perform loads and stores with it and do compare-exchange and the like.
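
A short sketch of those basic operations, assuming nothing beyond C++11 <atomic>. The struct and function names are made up for illustration. Note that compare-exchange compares bit patterns rather than using operator==, which matters for values such as -0.0.

```cpp
#include <atomic>

// Hypothetical record of what each basic operation returned.
struct BasicOpsResult {
    double loaded;         // value seen by load()
    double exchanged;      // previous value returned by exchange()
    bool cas_succeeded;    // compare-exchange with a matching expected value
    bool cas_on_neg_zero;  // compare-exchange expecting 0.0 against stored -0.0
    double final_value;
};

inline BasicOpsResult basic_ops_demo() {
    BasicOpsResult r{};
    std::atomic<double> value(1.0);

    value.store(2.0);
    r.loaded = value.load();            // reads 2.0
    r.exchanged = value.exchange(3.0);  // returns 2.0, stores 3.0

    double expected = 3.0;
    r.cas_succeeded = value.compare_exchange_strong(expected, 4.0);
    r.final_value = value.load();

    // The comparison is bitwise: -0.0 == 0.0 is true for doubles, but the
    // bit patterns differ, so this compare-exchange does not succeed.
    std::atomic<double> neg_zero(-0.0);
    double zero = 0.0;
    r.cas_on_neg_zero = neg_zero.compare_exchange_strong(zero, 1.0);
    return r;
}
```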

The C++11 standard specifies that the arithmetic operations (fetch_add, ++, +=, &=, etc.) are only provided for atomics of "integral types" (and, in part, pointers), so a std::atomic<double> won't have any of those operations defined.

My understanding is that hardware in use today offers little support for fetch-add or other atomic arithmetic on floating-point types, so the C++ standard doesn't provide those operators: they would have to be implemented inefficiently.

Edit: as an aside, std::atomic<double> in VS2015RC is lock-free.
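
To check this on your own toolchain, a tiny sketch along these lines should do. The function name is an assumption for illustration, and the answer it gives depends on the compiler and target.

```cpp
#include <atomic>

// Hypothetical helper: reports whether this implementation's
// std::atomic<double> is lock-free.
inline bool atomic_double_is_lock_free() {
    std::atomic<double> probe(0.0);
    return probe.is_lock_free();
}
```

Printing the result with std::cout << std::boolalpha << atomic_double_is_lock_free() is one way to see what a given compiler reports.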

L answered 5/5, 2015 at 10:25 Comment(1)
Btw, it looks like std::atomic<Floating> is now (C++20) required to have some support for fetch_add/fetch_sub. – Catacaustic
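
A hedged sketch of what that C++20 support looks like: when the standard library defines the feature-test macro __cpp_lib_atomic_float, fetch_add is available as a member of std::atomic<double>; otherwise this sketch falls back to a compare-exchange loop. The function name add_relaxed is an illustrative assumption.

```cpp
#include <atomic>

// Adds delta to target and returns the previous value.
inline double add_relaxed(std::atomic<double>& target, double delta) {
#if defined(__cpp_lib_atomic_float)
    // C++20 library with floating-point atomic support: use the member.
    return target.fetch_add(delta, std::memory_order_relaxed);
#else
    // Older standard or library: emulate it with a CAS retry loop.
    double old_value = target.load(std::memory_order_relaxed);
    while (!target.compare_exchange_weak(old_value, old_value + delta,
                                         std::memory_order_relaxed)) {
    }
    return old_value;
#endif
}
```

Whether the C++20 member compiles to a single instruction or to a CAS loop internally is up to the implementation and the hardware.
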
G
9

The standard library mandates std::atomic<T> where T is any TriviallyCopyable type. Since double is TriviallyCopyable, std::atomic<double> should compile and work perfectly well.

If it does not, you have a faulty library.

Edit, following the comment clarifying the question:

The C++ standard specifies specialisations for the fundamental integral types (i.e. the integer types that are required to be present in the language). These specialisations have requirements beyond the general case of std::atomic, in that they must support:

  • fetch_add
  • fetch_sub
  • fetch_and
  • fetch_or
  • fetch_xor
  • operator++
  • operator--
  • compound assignment operators (+=, -=, &=, |=, ^=)

OR, XOR and AND are of course not relevant for floating-point types, and indeed even comparisons start to become tricky (because of the need to handle the epsilon). So it seems unreasonable to mandate that library maintainers make specific specialisations available when there is no case to support the demand.

There is of course nothing to prevent a library maintainer from providing this specialisation in the unlikely event that a given architecture supports the atomic exclusive-or of two doubles (it never will!).

Gland answered 5/5, 2015 at 9:15 Comment(5)
I should have been clearer; I'll expand my question a bit. On the page I linked to in the question, a number of types are listed as having a complete specialization of std::atomic<>. double isn't among them. – Vaules
Atomic AND is relevant for fabs() (clearing the sign bit). XOR/OR can also usefully mess with the sign bit. Fun fact: a C++ standards proposal (open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0020r5.html) proposes adding fetch_add to atomic floating-point types, since apparently some hardware supports it efficiently. See also "Atomic double floating point or SSE/AVX vector load/store on x86_64" for an asm perspective, and some inefficient compiler output :( – Ceporah
Interesting. Thanks @PeterCordes – Gland
The link above is dead. – Brightwork
@Brightwork Yes, I see. I'll remove it. – Gland
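
The sign-bit idea from the comments can be sketched by keeping the double's bit pattern in a std::atomic<std::uint64_t>, which is also the integer-punning approach suggested under the question. This is only an illustration under stated assumptions: double is a 64-bit IEEE 754 type (checked by the static_asserts), and the class AtomicDoubleBits with its member names is hypothetical.

```cpp
#include <atomic>
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(sizeof(double) == sizeof(std::uint64_t), "assumes a 64-bit double");
static_assert(std::numeric_limits<double>::is_iec559, "assumes IEEE 754 doubles");

constexpr std::uint64_t kSignBit = std::uint64_t{1} << 63;

// Hypothetical wrapper: stores a double as its bit pattern so that the
// integer-only bitwise RMW operations can be applied to the sign bit.
class AtomicDoubleBits {
public:
    explicit AtomicDoubleBits(double v = 0.0) : bits_(to_bits(v)) {}

    double load() const { return from_bits(bits_.load()); }
    void store(double v) { bits_.store(to_bits(v)); }

    // Atomic fabs: clears the sign bit with AND, returns the old value.
    double fetch_abs() { return from_bits(bits_.fetch_and(~kSignBit)); }

    // Atomic negate: flips the sign bit with XOR, returns the old value.
    double fetch_negate() { return from_bits(bits_.fetch_xor(kSignBit)); }

private:
    // memcpy is the portable way to reinterpret the object representation.
    static std::uint64_t to_bits(double v) {
        std::uint64_t b;
        std::memcpy(&b, &v, sizeof b);
        return b;
    }
    static double from_bits(std::uint64_t b) {
        double v;
        std::memcpy(&v, &b, sizeof v);
        return v;
    }

    std::atomic<std::uint64_t> bits_;
};
```

For instance, an AtomicDoubleBits constructed with -2.5 holds 2.5 after fetch_abs() and -2.5 again after a following fetch_negate().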
