How to use x64 Interlocked Operations against MemoryMappedFiles in .NET

I need to use Interlocked Operations (CompareExchange, Increment etc.) against memory in MemoryMappedFiles in .NET.

I found this answer to a very similar question. The problem is that the Interlocked functions are not exported from kernel32.dll (or any other DLL) on a 64-bit OS (see e.g. http://blog.kalmbachnet.de/?postid=46).

Is there any other way to call Interlocked functions on a block of memory in a 64-bit .NET process?

Juniorjuniority answered 23/9, 2014 at 10:27 Comment(5)
I would try writing my own C DLL with exported functions that call the interlocked functions, and P/Invoke it from .NET. Dreamy
@AlexFarber Excellent point! I was just going to ask about this :) Do you happen to know whether I can easily find the ASM implementation of the compiler-intrinsic Interlocked functions (e.g. http://msdn.microsoft.com/en-us/library/2ddez55b(v=vs.80).aspx), so that I do not have to reinvent the ASM code myself? Juniorjuniority
You don't need to do this. Just call the required functions from a native DLL and the compiler will do the rest. I mean: for each interlocked function that you need, write an exported DLL function that calls that Interlocked function. Dreamy
The point of using this kind of atomic access function is to get it inlined so there is the absolute minimum of overhead. Once you have to P/Invoke, that point is entirely lost; there's just no point left in avoiding a named sync object. Poler
@HansPassant In my case I share a memory buffer with hundreds of long values. I would need hundreds of sync objects (mutexes etc.) in order to avoid contention. I was also hoping for Interlocked because those operations force the new value through the memory cache lines (as opposed to memory barriers, which have unspecified timing). But you are right that P/Invoke calls would likely wipe out this benefit :| Juniorjuniority

Write yourself a small C++/CLI helper library that provides interlocked operations consumable by managed code.

I believe the fastest interop path would be to expose a managed class that internally calls into an unmanaged function which itself makes use of the interlocked intrinsics. That way you don't even have to go through P/Invoke.
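As a rough illustration of the unmanaged side of such a helper, here is a minimal sketch. The function names `AtomicIncrement64` and `AtomicCompareExchange64` and the `SKETCH_EXPORT` macro are hypothetical and not taken from any existing library. It assumes the MSVC intrinsics `_InterlockedIncrement64` and `_InterlockedCompareExchange64` when building on Windows, and falls back to the GCC/Clang `__atomic` builtins elsewhere so the same file compiles on other toolchains. It also assumes the target address is 8-byte aligned.

```cpp
#include <cstdint>

#if defined(_MSC_VER)
#include <intrin.h>
// Export with C linkage so the names are not mangled.
#define SKETCH_EXPORT extern "C" __declspec(dllexport)
#else
#define SKETCH_EXPORT extern "C"
#endif

// Hypothetical export: atomically adds 1 to *target and returns the NEW value.
// Assumption: target is 8-byte aligned (e.g. a slot inside a mapped view).
SKETCH_EXPORT std::int64_t AtomicIncrement64(volatile std::int64_t* target) {
#if defined(_MSC_VER)
    return _InterlockedIncrement64(target);
#else
    // Assumption: GCC/Clang builtin used as a stand-in for the MSVC intrinsic.
    return __atomic_add_fetch(target, 1, __ATOMIC_SEQ_CST);
#endif
}

// Hypothetical export: if *target == comparand, stores exchange into *target.
// Always returns the value *target held BEFORE the call.
SKETCH_EXPORT std::int64_t AtomicCompareExchange64(volatile std::int64_t* target,
                                                   std::int64_t exchange,
                                                   std::int64_t comparand) {
#if defined(_MSC_VER)
    return _InterlockedCompareExchange64(target, exchange, comparand);
#else
    std::int64_t expected = comparand;
    // On failure the builtin writes the current value into 'expected';
    // on success 'expected' still equals the previous value.
    __atomic_compare_exchange_n(target, &expected, exchange, false,
                                __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
    return expected;
#endif
}
```

The managed caller would pass the address of one 64-bit slot inside the mapped view. Whether the functions are reached through a C++/CLI class or through P/Invoke, each call still pays a managed-to-native transition, which is the overhead discussed in the comments.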

Surfacetosurface answered 16/10, 2014 at 10:43 Comment(3)
Unfortunately this is not true: C++/CLI is slower than P/Invoke with suppressed checks. See e.g. codeproject.com/Articles/253444/… or xinterop.com/index.php/2013/05/01/… So P/Invoke is the way to go (unfortunately it still adds over a dozen instructions per call). Juniorjuniority
The first article seems to show that the C++ wrapper is faster. In the second article the C++ wrapper is so much slower that I start to become suspicious. Maybe optimizations were not turned on, or extra work was performed (indeed, the C++ wrapper calls sqrt only through an intermediate class. Why?). In both articles the benchmark times were very small, so there is a lot of noise. DateTime.Now is not very precise either: normally it increases in 15 ms steps, and his tests were in the 10-30 ms range. I don't trust either article and I will not invest the time to investigate more. Surfacetosurface
I agree with your findings. The main point is that if you want maximum speed you need to suppress all the stack trace probing etc. associated with managed<->native transitions. With P/Invoke you can do this by specifying the SuppressUnmanagedCodeSecurity attribute. With a C++/CLI wrapper you get all those checks by default and, to my knowledge, you cannot shortcut them. Juniorjuniority
