Atomically exchange value on result of comparison
Asked Answered
G

4

7

I have a very simple operation that needs to be done atomically:

if (a > b)
  b = a

where a and b are ints

EDIT: and a is local.

Is there a fast way to do this in C#? I'd like to avoid locking manually if possible. I've looked at Interlocked.CompareExchange, but as I understand it, this only tests for equality.

Thanks!

Globigerina answered 17/8, 2011 at 22:13 Comment(5)
Obligatory Old New Thing link: blogs.msdn.com/b/oldnewthing/archive/2004/09/15/229915.aspxRune
@vcsjones: Interesting article, thanks for that. However, in my situation, b is only ever written to in the above circumstance. I think the Interlocked-type approach is appropriate here.Globigerina
@dcrooney - Interlocked may be appropriate, and you'll see that the solution used to solve it (CompareExchange in a loop) is Henning Makholm's answer.Rune
similar to #3731588Zarate
@Brian Gideon : looks like not quite similar, here is we have one local variable without multiple threads accessing itVisualize
S
9

The canonical way is to use interlocked compare-exchange in a loop:

int oldvalue, newvalue ;
do {
  oldvalue = b ; // you'll want to force this to be a volatile read somehow
  if( a > oldvalue )
    newvalue = a ;
  else
    break ;
} while( interlocked replace oldvalue with newvalue in b does NOT succeed );

(Pseudocode because I don't bother to look up the correct way to do an interlocked exchange in C#).

As you see, unless you have overriding efficiency concerns, using plain old mutexes is far simpler and more readable.

Edit: This assumes that a is a local variable or at least not subject to asynchronous writes. It both of a and b can be modified behind your back, then there is no lock-free way of doing this update atomically. (Thanks to silev for pointing this out).

Sophia answered 17/8, 2011 at 22:19 Comment(13)
But this would be manual locking and a lot of stuff with loops, overkill, what you think?Visualize
I understood "avoid locking manually" as meaning that he wanted not to use the lock keyword or explicit mutexes. If that's the boundary condition (and I'm not really endorsing it, as should be clear), then using a loop is inevitable, at least with the x86 locking model.Sophia
Just imagine THREAD#1: read value of 'a' let's say 10 after line if( a > oldvalue ) ==> thread switch ==> THREAD#2: assign -1 to 'a' ==> thread switch ==> THREAD#1: trying to assign 'a' to newValue == result will be -1 but originally THREAD#1 readed 10Visualize
@sllev: Doing a few loop iterations (even 100) is faster than waiting on a critical section, or even just trying to enter the critical section.Rune
@Rune : absolutely (I understand what is under the hood of lock()), but is there an other option?Visualize
Thanks much. I understand that a simple lock statement would be easier to implement, but efficiency is a major concern in this situation. Looks like this is the way to go. :)Globigerina
Can anyone describe step by step how both statements "if( a > oldvalue ) newvalue = a ;" will be executed in sindle atomic operation without race condition?Visualize
@Henning Makholm : I believe there are a lot of assumptions so this would not be true atomic operation, this by nature CAN NOT BE, and as result solution is not clear and moreover - 2 downvotes for me, damn :)Visualize
@sllev - That as a whole isn't an atomic operation, what is going on is we do the operation, get the value; and atomically compare the old/new values to their originals. If they have changed; then let the loop try again. If they haven't changed, exchange the computed value as an atomic operation. You don't care about the whole operation being atomic; all you care about is atomically checking if anything changed after your computation is done and exchange it if not.Rune
@Henning Makholm, sllev: a is local. Updated original post to reflect this.Globigerina
@Rune : OP said that this is single operation and you translated it to loop and saying that this is atomic, I do not agree with such approach. Atomic mean "is one which is always observed to be done or not done, but never halfway done", in your case a lot of things could happen whilst execution of loop even EPIC Fail. ORIGINAL QUESTIOn - "very simple operation" - 1 operation, "atomically" - should not be divided by multiple steps anyway..... so I would stay that this is not atomic operation at wholeVisualize
@vcsjones: As I understand it, and the last time I looked, the Windows Critical Section is a spin-lock with a bounded number of loops. It should be nearly as fast as doing this compare-exchange operation unless contention is so high that the operation has to go into the kernel.Astroid
@Zan, but if we are talking micro-optimizations, updating via a spinlock would entail three writes to shared memory even on the fast path: once to acquire the lock, once to update b and yet once to release the lock. All other things being equal, that puts a bigger load on the cache coherency protocol. (Not that I'm sure which protocols are used by contemporary hardware).Sophia
Z
4

Henning is correct. I will provide the details as they pertain to C#. The pattern can be generalized with the following function.

public static T InterlockedOperation<T>(ref T location, T operand)
{
  T initial, computed;
  do
  {
    initial = location;
    computed = op(initial, operand); // where op is replaced with a specific implementation
  } 
  while (Interlocked.CompareExchange(ref location, computed, initial) != initial);
  return computed;
}

In your specific case we can define an InterlockedGreaterThanExchange function like this.

public static int InterlockedGreaterThanExchange(ref int location, int value)
{
  int initial, computed;
  do
  {
    initial = location;
    computed = value > initial ? value : initial;
  } 
  while (Interlocked.CompareExchange(ref location, computed, initial) != initial);
  return computed;
}
Zarate answered 18/8, 2011 at 0:43 Comment(4)
MemoryBarrier is required only on multiprocessor systems with weak memory ordering (for example, a system employing multiple Intel Itanium processors). For most purposes, the C# lock statement, the Visual Basic SyncLock statement, or the Monitor class provide easier ways to synchronize data. msdn.microsoft.com/en-us/library/…Visualize
So you believe it would be simpler and faster rather than lock() ?Visualize
The memory barrier is definitely needed. Its the call to Thread.MemoryBarrier which can be omitted because Interlocked.CompareExchange already generates one which prevents the read of location from getting lifted (by optimization) outside of the loop. You are right though, the x86 hardware has a tight guarantees, but the CLI (via the ECMA spec) has weak guarantees which means the optimization could take place at the JIT level. You have to code for the weaker model. That is why Thread.MemoryBarrier and the like exist in the first place.Zarate
I think a plain old lock would be simpler and that is why I upvoted your answer, but a loop with a CAS operation would probably be faster especially in highly concurrent systems with many cores.Zarate
V
0

There is no such atomic operation to execute both comparision and assignment in true atomic manner. In C# true atomic operations could be Interlocked operations like CompareExchenge() and Exchange() to exchange two integer values (32bit for 32bit architecture and 64-bit for 64 bit processor architecture).

Obviously simple assignment a = b is a two operations - read and write. So there is no possibility to do comparision + assignment in one atomic step... Does it possible in any other language/architecture, I mean true atomic? ;)

EDIT: Original question does not indicated that 'a' is local variable... uhhhggg such a small correction... any way I'll leave my answer as is because I believe in nature of true atomic operations and not in some kind of 'tricks' which has ambitions to be atomic

So to summarize:

  • we can not do such operations as single atomic operation
  • only one way is to syncronize it using locking mechanism (assuming original question which does not states that a is local, so both a and b potentially could be accessed by an other thread
object lockObject = new object();
int a = 10;
int b = 5;

lock (lockObject)
{
   if (a > b)
   {
      b = a
   }
}
Visualize answered 17/8, 2011 at 22:18 Comment(1)
I'm neutral on this answer since it is correct, and although opinionated it gives a not-bad reason for it. But I hate to see it getting negative votes, so upvoting to 0.Brominate
B
0

These are a couple of optimized "recipes" I came up with after reading other answers. Not directly related to the question, but adding here since this is where related searches landed. Tried to keep this to a > b or a >= b to match the original question. And to keep them general so as not to bias the descriptions away from other related problems. Both could be wrapped into methods.

Optimization - Monotonic b

If this is the only operation that is being done to b or b is otherwise monotonically increasing, you can optimize away the interlocked assignment and the retries where a > b == false:

int initial;
do 
{
  initial = b;
}
while ((a > initial) && Interlocked.CompareExchange(ref b, a, initial) != initial);

If a > b == false it won't be true after any retry (b is only getting larger) and the exchange can be skipped since b would be unaffected.

Related problem optimization: Loose throttling

Action signal() should only be called once each time a local variable a (e.g., a system time sample) has increased by some constant threshold >= 1 since the last time signal() was called. Here, b is a threshold placeholder compared to a and set to a + threshold if a >= b. Only the "winning" thread should call signal().

var initial = b;
if ((a > initial) && Interlocked.CompareExchange(ref b, a + threshold, initial) == initial)
{
  signal();
}

Here, b is again monotonic, but now we can optimize away the retry loop entirely since we only want to know if the local thread "won the race" with other threads while respecting threshold. This method will effectively "block" any other thread from signalling within the defined threshold.

Throttling Caveats

This one will not work if you need a "strict" throttle--that is, the "smallest" a where a >= b--thus the "loose" in the title. This will also not work if you need to report only the last in a series of closely grouped as--what might be called "debouncing", related but distinct problem.

Brominate answered 20/8, 2018 at 19:31 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.