Why is there no overload of Interlocked.Add that accepts Doubles as parameters?
I fully appreciate the atomicity that the Threading.Interlocked class provides; I don't understand, though, why the Add function only offers two overloads: one for Integers, another for Longs. Why not Doubles, or any other numeric type for that matter?

Clearly, the intended method for changing a Double is CompareExchange; I am GUESSING this is because modifying a Double is a more complex operation than modifying an Integer. Still it isn't clear to me why, if CompareExchange and Add can both accept Integers, they can't also both accept Doubles.

Modernism answered 9/9, 2009 at 15:41 Comment(0)

The Interlocked class wraps around the Windows API Interlocked** functions.

These, in turn, wrap the native processor instructions, using the LOCK instruction prefix on x86. The LOCK prefix can be applied only to the following instructions:

BT, BTS, BTR, BTC, XCHG, XADD, ADD, OR, ADC, SBB, AND, SUB, XOR, NOT, NEG, INC, DEC

You'll note that these map pretty much directly onto the Interlocked methods. Unfortunately, ADD for non-integer types is not supported here. Add for 64-bit longs is supported on 64-bit platforms.

Here's a great article discussing lock semantics on the instruction level.

Varus answered 9/9, 2009 at 15:58 Comment(3)
The link is not active anymore. Here's a backup from Internet Archive web.archive.org/web/20160319061137/http://www.codemaestro.com/…Eskimo
You forgot lock cmpxchg and lock cmpxchg8b. Maybe you're looking at the manual from an ancient 386, before that flexible building-block was added to the ISA? felixcloutier.com/x86/LOCK.html is from a modern manual. It can emulate any RMW operation, including std::atomic<double> addition; C++20 even added a fetch_add() member function for FP templates, not just integer. See Atomic double floating point or SSE/AVX vector load/store on x86_64 and preshing.com/20150402/…Henceforth
(It still makes sense that there are no wrapper functions for FP atomic RMWs, since that's not something x86 can do in a single instruction.)Henceforth

Others have addressed the "why?". It is easy, however, to roll your own Add(ref double, double) using the CompareExchange primitive:

public static double Add(ref double location1, double value)
{
    double newCurrentValue = location1; // non-volatile read, so may be stale
    while (true)
    {
        double currentValue = newCurrentValue;
        double newValue = currentValue + value;
        newCurrentValue = Interlocked.CompareExchange(ref location1, newValue, currentValue);
        if (newCurrentValue.Equals(currentValue)) // see "Update" below
            return newValue;
    }
}

CompareExchange sets the value of location1 to be newValue, if the current value equals currentValue. As it does so in an atomic, thread-safe way, we can rely on it alone without resorting to locks.

Why the while (true) loop? Loops like this are standard when implementing optimistically concurrent algorithms. CompareExchange will not change location1 if the current value is different from currentValue. I initialized newCurrentValue with a non-volatile read of location1 (which might be stale, but that does not affect correctness, as CompareExchange will check the value). If the current value (still) is what we read from location1, CompareExchange changes the value to newValue. If not, we have to retry CompareExchange with the new current value, as returned by CompareExchange.

If the value is changed by another thread before our next CompareExchange, it will fail again, necessitating another retry - and in theory this can go on forever, hence the loop. Unless you are constantly changing the value from multiple threads, CompareExchange will most likely be called only once (if the current value is still what the non-volatile read of location1 yielded) or twice (if it was different).
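As a minimal sketch of how the helper above might be exercised (the class, field, and harness names here are illustrative, not part of the original answer):

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class Program
{
    static double total;

    // The CompareExchange-based Add from the answer above.
    public static double Add(ref double location1, double value)
    {
        double newCurrentValue = location1; // non-volatile read, so may be stale
        while (true)
        {
            double currentValue = newCurrentValue;
            double newValue = currentValue + value;
            newCurrentValue = Interlocked.CompareExchange(ref location1, newValue, currentValue);
            if (newCurrentValue.Equals(currentValue))
                return newValue;
        }
    }

    static void Main()
    {
        // 1000 concurrent additions of 0.5; 0.5 is exactly representable,
        // so the result is exactly 500 unless an update were lost.
        Parallel.For(0, 1000, _ => Add(ref total, 0.5));
        Console.WriteLine(total); // prints 500
    }
}
```

Note that passing `ref total` from inside the lambda is fine because total is a field; C# only forbids capturing ref locals and ref parameters.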


Update 2022/8/17

As Dr. Strangelove and Theodor Zoulias pointed out in the comments, when location1 contains Double.NaN, Add() would turn into an infinite loop (NaN == NaN is always false in IEEE semantics).

So I had to change

if (newCurrentValue == currentValue)

to

if (newCurrentValue.Equals(currentValue))
Scenario answered 3/6, 2013 at 9:27 Comment(20)
Can you explain a bit more how this works? Why do you loop? And what's the role of CompareExchange?Maletta
This is now an excellent answer! At least now I think I know what optimistic concurrency means: 'Writing concurrent code with the assumption that things will eventually go right.' In case of our while(true) this means we will not loop forever (we're being reasonably optimistic). Thanks a lot!Maletta
@Mzn pessimistic concurrency: let's take a lock around multiple operations, because it could go wrong with multiple steps. Reliable, and a reliable bottleneck for concurrent code. optimistic concurrency: let's give it a lockless try using CompareExchange, because most of the time it will just work with multiple threads, and if not, we'll just retry. This allows for more concurrency, and is almost always faster. It's not easy to program, and some things just can't be done without locks.Scenario
Could you explain why you set newCurrentValue = 0 rather than setting newCurrentValue = location1? Wouldn't the latter avoid the first loop iteration if the value is != 0? ThanksChloe
@Chloe newCurrentValue = location1 is not thread-safe, as it does not insert a memory barrier. Since .Net 4.5 you could do newCurrentValue = Volatile.Read(ref location1), but the read memory barrier incurs a performance penalty, so for the case where the current value is 0, you will have one memory barrier for the volatile read and one for CompareExchange. A volatile read however should be - not sure about .Net though - faster than CompareExchange, so for current values other than 0 using a volatile read would be faster.Scenario
@Chloe ctd So it is a tradeoff, although the initial volatile read will probably make it, if only slightly, faster for almost all cases. The signature of the method that I use in practice however, for cases where I know the current value with high probability from a previous write is this: double Add(ref double location1, double value, double currentValue = 0).Scenario
@Chloe Thinking about the non-volatile assignment "is not thread-safe" again - it does not really matter here, as in those cases where we do read a wrong or stale value (e.g. from the L1 cache), the worst case is that the while loop gets iterated one more time. So maybe what you suggest is in fact a good optimization that will work most of the time - certainly more often than the constant 0 assignment - without the use of a memory barrier. Will mull over it some more and probably adapt my answer accordingly. For the time being, I'll keep the comments, as they reflect the train of thoughtScenario
I was thinking along similar lines to your last comment, in that even if it were a stale value, it is more likely to be correct than 0 and could save an iteration. Let me know what you think too. ThanksChloe
@Chloe Absolutely! Updated my answer.Scenario
I guess this has a corner case where you end-up in an infinite loop: how best do you address the case where location1 is NaN?Bellew
@Dr.Strangelove I guess that you could use the Equals method instead of the ==: if (newCurrentValue.Equals(currentValue)). -- "If two Double.NaN values are tested for equality by calling the Equals method, the method returns true."Petiole
Note that theoretically a non-volatile read of a double value can be torn. Also the Double.NaN value has multiple valid binary representations. So it's theoretically possible for a torn double value to be equal to Double.NaN. I can't think of a problematic scenario, but if you want to be on the safe side prefer newCurrentValue = Volatile.Read(ref location1).Petiole
@TheodorZoulias If you are talking about the first assignment to newCurrentValue, before entering the loop: Worst case here is that the loop would need to run one iteration more than strictly necessary - but costing an additional volatile read on every single call. If you are talking about the assignment inside the loop: The Interlocked docs state the class provides atomic operations for variables that are shared by multiple threads. So there's no need for Volatile.Read there either.Scenario
@EvgeniyBerezovsky I am talking about the first assignment, before entering the loop. I am not 100% sure about the worst case scenario. The problem is that it's possible to create equal double values with different underlying byte[] data. There is not a one-to-one relationship between bytes and value. This is true: BitConverter.ToDouble(new byte[] { 0, 0, 0, 0, 0, 0, 0xF8, 0xFF }, 0).Equals(BitConverter.ToDouble(new byte[] { 1, 0, 0, 0, 0, 0, 0xF8, 0xFF }, 0)).Petiole
@Dr.Strangelove (Thanks for elaborating, Theodor). I just tested Dr. Strangelove's assertion, and he's right: double d1 = Double.NaN; Add(ref d1, 2); spins forever. As Theodor suggested, using Equals() instead of == fixes the problem. So it looks like CompareExchange does not just copy newValue to location1 and return it as is, but newValue and newCurrentValue will end up being !=, but Equals(). I have changed my answer accordingly.Scenario
@EvgeniyBerezovsky: In IEEE FP semantics, NaN == NaN is always false, even if they have identical bit-patterns. If your function can only return by an == comparison on doubles producing true, that's always a problem. Nothing to do with CompareExchange changing bit-patterns, just the fact that Interlocked.CompareExchange doesn't return a separate bool so you have to synthesize that from the one output it does produce. (Unlike C++11 compare_exchange_weak/strong which update the "expected" arg by reference on CAS failure, and return a bool status.)Henceforth
@PeterCordes Thanks - I finally understand what's going on here!Scenario
Thank you all for elaborating on this and suggesting a fix.Bellew
@EvgeniyBerezovsky here is the worst case scenario. Imagine that some smart devs have decided to use the Double.NaN representation redundancy to store extra bits of information inside it. There are 50 exploitable redundant bits in a Double.NaN, so it's not unreasonable. A non-volatile torn read of the location1 may result in corrupted data stored in the location1, after the completion of the Add. Corrupted for the purposes of the smart devs of course, who don't regard all Double.NaN representations as equal.Petiole
@EvgeniyBerezovsky apart from doing an initial Volatile.Read of the location1 (which is atomic, no risk of reading a torn value), this problem might also be solvable by comparing the bits of the two double values, instead of relying on the native APIs provided by the Double struct (== and Equals). For example: BitConverter.DoubleToInt64Bits(newCurrentValue) == BitConverter.DoubleToInt64Bits(currentValue).Petiole
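A sketch of the bits-based variant suggested in the comments above (the class name is illustrative): the initial read is volatile, so it cannot be torn, and CAS success is decided by comparing raw bit patterns rather than Double.Equals, so distinct NaN payloads are not conflated.

```csharp
using System;
using System.Threading;

static class InterlockedEx
{
    public static double Add(ref double location1, double value)
    {
        double newCurrentValue = Volatile.Read(ref location1); // atomic, never torn
        while (true)
        {
            double currentValue = newCurrentValue;
            double newValue = currentValue + value;
            newCurrentValue = Interlocked.CompareExchange(ref location1, newValue, currentValue);
            // Compare the raw 64-bit patterns instead of using == or Equals:
            // CompareExchange itself compares bitwise, so identical bits mean success.
            if (BitConverter.DoubleToInt64Bits(newCurrentValue) ==
                BitConverter.DoubleToInt64Bits(currentValue))
                return newValue;
        }
    }
}
```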

As Reed Copsey has said, the Interlocked operations map (via Windows API functions) to instructions supported directly by the x86/x64 processors. Given that one of those functions is XCHG, you can do an atomic XCHG operation without really caring what the bits at the target location represent. In other words, the code can "pretend" that the 64-bit floating point number you are exchanging is in fact a 64-bit integer, and the XCHG instruction won't know the difference. Thus, .Net can provide Interlocked.Exchange functions for floats and doubles by "pretending" that they are integers and long integers, respectively.

However, all of the other operations actually operate on the individual bits of the destination, so they won't work unless the values actually represent integers (or bit arrays, in some cases).
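A sketch of that "pretending", assuming the shared storage is declared as a long so no unsafe reinterpretation is needed (class and member names are hypothetical; in practice .NET exposes an Interlocked.Exchange overload for double directly, as the answer notes):

```csharp
using System;
using System.Threading;

static class DoubleSlot
{
    // Storage declared as long; XCHG doesn't care that these
    // 64 bits happen to encode a double.
    static long bits;

    // Atomically store a new double and return the previous one,
    // using only the long overload of Interlocked.Exchange.
    public static double Exchange(double value)
    {
        long old = Interlocked.Exchange(ref bits, BitConverter.DoubleToInt64Bits(value));
        return BitConverter.Int64BitsToDouble(old);
    }
}
```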

Garton answered 27/6, 2011 at 23:37 Comment(1)
A lock cmpxchg retry loop can synthesize any atomic RMW operation you want, as long as you can do a wide enough CAS. It makes sense that there isn't an Interlocked wrapper function for FP add, but that doesn't stop you from implementing the operation yourself. en.wikipedia.org/wiki/… / preshing.com/20150402/… . See Atomic double floating point or SSE/AVX vector load/store on x86_64 for compiler-generated asm from C++11 atomic<double>Henceforth

I suspect that there are two reasons.

  1. The processors targeted by .NET support interlocked increment only for integer types. I believe this is the LOCK prefix on x86, probably similar instructions exist for other processors.
  2. Adding one to a floating point number can result in the same number if it is big enough, so I'm not sure if you can call that an increment. Perhaps the framework designers are trying to avoid nonintuitive behavior in this case.
Vitals answered 9/9, 2009 at 15:52 Comment(1)
Regarding your second point: yes, but I am asking about Interlocked.Add, not Interlocked.Increment.Modernism

© 2022 - 2024 — McMap. All rights reserved.