Performance of Interlocked.Increment

Is Interlocked.Increment(ref x) faster or slower than x++ for ints and longs on various platforms?

Hephzibah answered 23/6, 2009 at 17:46 Comment(2)
As others point out, it's not the same thing. That said, according to msdn.microsoft.com/en-us/magazine/cc163726.aspx an Interlocked.Increment takes some 14 ns (or about 71,000,000 per second), so I wouldn't worry too much about performance. – Irate
Interlocked.Increment is intended to be used in multithreaded environments. – Butterfish

It is slower since it forces the action to occur atomically and it acts as a memory barrier, eliminating the processor's ability to re-order memory accesses around the instruction.

You should be using Interlocked.Increment when you want the action to be atomic on state that can be shared between threads - it's not intended to be a full replacement for x++.
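A minimal sketch of the difference (the field names and iteration counts here are my own, not from the answer): with several threads bumping a shared counter, plain ++ loses updates, while Interlocked.Increment does not.

using System;
using System.Threading;
using System.Threading.Tasks;

class CounterDemo
{
    static int plainCount;   // incremented with ++ (non-atomic read-modify-write)
    static int atomicCount;  // incremented with Interlocked.Increment (atomic)

    static void Main()
    {
        Parallel.For(0, 4, _ =>
        {
            for (int i = 0; i < 1000000; i++)
            {
                plainCount++;
                Interlocked.Increment(ref atomicCount);
            }
        });

        // atomicCount is always 4,000,000; plainCount is usually smaller
        // because concurrent ++ operations overwrite each other's results.
        Console.WriteLine("plain: {0}, interlocked: {1}", plainCount, atomicCount);
    }
}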

Mirilla answered 23/6, 2009 at 17:53 Comment(0)

In our experience, InterlockedIncrement() et al. on Windows have a quite significant impact. In one sample case we were able to eliminate the interlock and use ++/-- instead. This alone reduced run time from 140 seconds to 110 seconds. My analysis is that the interlock forces a memory roundtrip (otherwise how could other cores see it?). An L1 cache read/write is around 10 clock cycles, but a memory read/write is more like 100.

In this sample case, I estimated the number of increment/decrement operations at about 1 billion. So on a 2 GHz CPU this is something like 5 seconds for the ++/--, and 50 seconds for the interlock. Spread the difference across several threads, and it's close to 30 seconds.
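A rough way to check estimates like this yourself is a single-threaded micro-benchmark (my own sketch, not from the answer; the iteration count mirrors the ~1 billion figure above, and the absolute numbers will vary by CPU and JIT):

using System;
using System.Diagnostics;
using System.Threading;

class IncrementBench
{
    static void Main()
    {
        const long N = 1000000000;   // roughly the 1 billion operations estimated above
        long counter = 0;

        var sw = Stopwatch.StartNew();
        for (long i = 0; i < N; i++) counter++;
        sw.Stop();
        Console.WriteLine("++          : {0} (counter = {1})", sw.Elapsed, counter);

        counter = 0;
        sw.Restart();
        for (long i = 0; i < N; i++) Interlocked.Increment(ref counter);
        sw.Stop();
        Console.WriteLine("Interlocked : {0} (counter = {1})", sw.Elapsed, counter);
        // Printing the counter keeps the JIT from optimizing the loops away entirely.
    }
}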

Shon answered 19/8, 2009 at 18:6 Comment(2)
Microsoft says: InterlockedIncrement was measured as taking 36-90 cycles in msdn.microsoft.com/en-us/library/windows/desktop/… – Isodimorphism
36 sounds right for an uncontested operation; I'm measuring about 120 cycles for a heavily contested operation on a Core i7, but maybe I botched it? Anyway, "the interlock forces a memory roundtrip (otherwise how could other cores see it?). An L1 cache read/write is around 10 clock cycles..." – it's enough to mark that cache line as changed and only flush from L1 to memory if another core needs to see it, so an uncontested operation can be closer to the 10 end of the spectrum (at 36) rather than 100+. – Synaeresis

Think about it for a moment, and you'll realize an Increment call cannot be any faster than a simple application of the increment operator. If it were, then the compiler's implementation of the increment operator would call Increment internally, and they'd perform the same.

But, as you can see by testing it for yourself, they don't perform the same.

The two options have different purposes. Use the increment operator generally. Use Increment when you need the operation to be atomic and you're sure all other users of that variable are also using interlocked operations. (If they're not all cooperating, then it doesn't really help.)
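One way to guarantee that cooperation is to hide the field behind a small wrapper so every access goes through Interlocked; a sketch (the SharedCounter type is my own illustration, not from the answer):

using System.Threading;

public sealed class SharedCounter
{
    private long _value;

    // All writers go through Interlocked, so no caller can slip in a plain ++.
    public long Increment()
    {
        return Interlocked.Increment(ref _value);
    }

    // Interlocked.Read gives an atomic 64-bit read even on 32-bit platforms.
    public long Read()
    {
        return Interlocked.Read(ref _value);
    }
}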

Discrepant answered 23/6, 2009 at 17:53 Comment(3)
No, it wouldn't - Interlocked.Increment cannot be called on a property, while the ++ operator can. Therefore, ++ wouldn't be able to call it. – Hephzibah
To be more precise, Increment takes a ref int (or long); ++ takes a non-ref int (or long). – Hephzibah
The compiler could certainly implement ++ via Increment. It wouldn't be implemented with a simple "call" instruction, but it could be done using a temporary introduced by the compiler. The point is that the compiler uses the fastest available method of incrementing a number; if there were something faster, the compiler would have used it instead. – Discrepant

It's slower. However, it's the most performant general way I know of for achieving thread safety on scalar variables.
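For read-modify-write operations beyond a simple increment, the same lock-free approach generalizes to an Interlocked.CompareExchange retry loop; a sketch (the atomic-maximum operation is my own example, not from the answer):

using System.Threading;

static class AtomicOps
{
    // Atomically set 'target' to the larger of its current value and 'candidate'.
    public static void Max(ref long target, long candidate)
    {
        long current = Interlocked.Read(ref target);
        while (candidate > current)
        {
            long previous = Interlocked.CompareExchange(ref target, candidate, current);
            if (previous == current)
                return;            // our value was written
            current = previous;    // another thread raced us; re-check and retry
        }
    }
}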

Leptospirosis answered 23/6, 2009 at 17:50 Comment(5)
volatile is more performant on scalars, but has the downside of requiring good coding practices to be used well. – Vellicate
Careful with volatile; some processor architectures (x86/x64) have the ability to reorder accesses to memory, regardless of whether that memory was marked as volatile for the compiler. – Leptospirosis
@DrewHoskins The behaviour is as the spec says, and the compiler is required to implement the spec correctly. A programmer writing for .Net never has to think about the memory model of the underlying ISA (except perhaps for performance). If illegal reorderings somehow occur, that's a serious compiler bug. – Etom
@Vellicate volatile with ++ is still not atomic / not thread-safe, while Interlocked.Increment (or a lock and ++) is thread-safe. Furthermore, volatile is unnecessary with either of the valid thread-safe methods. – Pneumatophore
@user2864740 I don't remember why I wrote what I did 10 years ago; there may have been other comments at the time that are now deleted. But the volatile keyword should be used when you expect a variable to change at any time and locks aren't required or possible. I don't use the term thread-safe here in my comment, but I think it's safe in the sense that you won't get deadlocks, though it's tricky to use right. You shouldn't use volatile when you want atomic read/write. – Vellicate

It will always be slower because it has to perform a CPU bus lock vs. just updating a register. However, modern CPUs achieve near-register performance, so it's negligible even in real-time processing.

Rasheedarasher answered 23/6, 2009 at 17:49 Comment(1)
While x86 CPUs perform a bus lock during Interlocked operations, a bus lock is not required by all CPUs that provide Interlocked operations. Some CPUs are capable of signaling that they have reserved a single cache line and can perform Interlocked operations on that cache line without a bus lock. – Proparoxytone

My performance test (each figure is the number of iterations completed in 5 seconds, so higher is better):

volatile: 65,174,400
lock: 62,428,600
interlocked: 113,248,900

TimeSpan span = TimeSpan.FromSeconds(5);

object syncRoot = new object();
long test = long.MinValue;

// Volatile read, local increment, volatile write (not an atomic sequence).
Do(span, "volatile", () =>
{
    long r = Thread.VolatileRead(ref test);
    r++;
    Thread.VolatileWrite(ref test, r);
});

// Increment under a monitor lock.
Do(span, "lock", () =>
{
    lock (syncRoot)
    {
        test++;
    }
});

// Atomic increment.
Do(span, "interlocked", () =>
{
    Interlocked.Increment(ref test);
});
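The Do helper isn't shown; a minimal sketch of what it might look like (my own reconstruction, assuming it simply repeats the action until the time span elapses and prints the iteration count, as the comments below describe):

static void Do(TimeSpan span, string name, Action action)
{
    long iterations = 0;
    var sw = System.Diagnostics.Stopwatch.StartNew();
    while (sw.Elapsed < span)
    {
        action();       // run one increment of the variant under test
        iterations++;
    }
    Console.WriteLine("{0}: {1:N0}", name, iterations);
}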
Roentgenotherapy answered 13/4, 2012 at 13:18 Comment(5)
It runs the method n times until the timespan has been reached. – Roentgenotherapy
So, wait, more is better in this case? Could you please specify what your metrics are? – Nesline
Doesn't using VolatileRead and VolatileWrite in this manner leave you open to the same race condition that lock and Interlocked.Increment avoid? – Bravo
Your volatile example isn't thread-safe. – Digiovanni
Of course it is faster than a full-on lock. I don't see where the OP ever discussed multi-threading, though. Obviously that's the purpose of interlocked instructions, but as written, no... an atomic, cache-coherent increment will never be faster than ++. Also, as stated above, the use of volatile here is questionable. It enforces coherency, but the read-increment-write sequence is not atomic, and you will clobber values written by other threads accessing the same counter simultaneously. The lock codepath is also almost certainly constructed from one or more interlocked ops. – Sharpsighted
