How to call a CPU instruction from C#?

My processor (Intel i7) supports the POPCNT instruction and I would like to call it from my C# application. Is this possible?

I believe I read somewhere that it isn't, but that the JIT will emit the instruction if it finds it available. If so, which function would I have to call for the JIT to substitute such an instruction?

Popcount is being called millions of times in a loop, so I'd like to have this CPU optimization if possible.

Scalawag answered 13/3, 2015 at 19:34 Comment(8)
Is C# the right language for this? I thought we used languages like C# so we don't have to think (that hard) about CPU instructions. – Grandpa
No, it is not. However, I prefer working with C#. – Scalawag
This question has been asked and answered on Stack Overflow. [1]: #6098135 – Caulicle
@KyleWilliamson That question is about how to determine whether the CPU supports the instruction, not how to call it. – Donner
"How do I hammer in this nail with a screwdriver? I know it's the wrong tool, but I hate hammers and love screwdrivers." If you need to do this then you need to use a different language. If that is not obvious to you then I'm afraid you will likely mess up the implementation anyway. – Tabbatha
Ah, you are correct. Sorry about that... The post may still be relevant, for it says that C# can't do CPU-level optimizations: "The JIT compiler in the common language runtime is able to do some optimization when the code is actually run, but there is no direct access to that process from the language itself." – Caulicle
This question also has some related information. Another way is to write the bottleneck part in unmanaged C++. – Dissertate
Do you want to use the instruction yourself, or do you want .NET to use SSE 4 for optimizations? – Ailsun

You want to play with fire, and here we like to play with fire...

using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

class Program
{
    const uint PAGE_EXECUTE_READWRITE = 0x40;
    const uint MEM_COMMIT = 0x1000;

    [DllImport("kernel32.dll", SetLastError = true)]
    static extern IntPtr VirtualAlloc(IntPtr lpAddress, IntPtr dwSize, uint flAllocationType, uint flProtect);

    private delegate int IntReturner();

    static void Main(string[] args)
    {
        List<byte> bodyBuilder = new List<byte>();
        bodyBuilder.Add(0xb8); // MOV EAX,
        bodyBuilder.AddRange(BitConverter.GetBytes(42)); // 42
        bodyBuilder.Add(0xc3);  // RET
        byte[] body = bodyBuilder.ToArray();
        IntPtr buf = VirtualAlloc(IntPtr.Zero, (IntPtr)body.Length, MEM_COMMIT, PAGE_EXECUTE_READWRITE);
        Marshal.Copy(body, 0, buf, body.Length);

        IntReturner ptr = (IntReturner)Marshal.GetDelegateForFunctionPointer(buf, typeof(IntReturner));
        Console.WriteLine(ptr());
    }
}

(this small example of assembly will simply return 42... I think it's the perfect number for this answer :-) )

In the end the trick is that:

A) You must know the opcodes corresponding to the asm you want to write

B) You use VirtualAlloc to make a page of memory executable

C) In some way you copy your opcodes there

(the code was taken from http://www.cnblogs.com/netact/archive/2013/01/10/2855448.html)
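To make step A concrete, here is a small sketch that builds the same byte sequence as the example above and prints it in hex instead of executing it. The class and method names (OpcodeDump, BuildBody) are made up for illustration. It only shows how the instruction bytes are laid out: 0xB8 is the opcode for MOV EAX, imm32, the four bytes after it are the constant, and 0xC3 is RET.

```csharp
using System;
using System.Collections.Generic;

static class OpcodeDump
{
    // Builds the same six bytes as the example above: MOV EAX, 42 followed by RET.
    public static byte[] BuildBody()
    {
        var body = new List<byte>();
        body.Add(0xB8);                            // opcode: MOV EAX, imm32
        body.AddRange(BitConverter.GetBytes(42));  // imm32, in the machine's byte order
        body.Add(0xC3);                            // opcode: RET
        return body.ToArray();
    }

    static void Main()
    {
        // On little-endian hardware (which includes every x86/x64 CPU)
        // this prints B8-2A-00-00-00-C3
        Console.WriteLine(BitConverter.ToString(BuildBody()));
    }
}
```

BitConverter.GetBytes uses the byte order of the machine it runs on, which happens to be exactly what the x86 immediate operand needs. Any other instruction has to be encoded by hand (or with an assembler) in the same way.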

OK... the first example is as written on that site (apart from fixing an error: dwSize was declared uint instead of IntPtr). The one below is how it should be written, or at least it's a +1 compared to the original. I would encapsulate everything in an IDisposable class instead of using try... finally.

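using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Runtime.InteropServices;

class Program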
{
    const uint PAGE_READWRITE = 0x04;
    const uint PAGE_EXECUTE = 0x10;
    const uint MEM_COMMIT = 0x1000;
    const uint MEM_RELEASE = 0x8000;

    [DllImport("kernel32.dll", SetLastError = true)]
    static extern IntPtr VirtualAlloc(IntPtr lpAddress, IntPtr dwSize, uint flAllocationType, uint flProtect);

    [DllImport("kernel32.dll", SetLastError = true)]
    [return: MarshalAs(UnmanagedType.Bool)]
    static extern bool VirtualProtect(IntPtr lpAddress, IntPtr dwSize, uint flNewProtect, out uint lpflOldProtect);

    [DllImport("kernel32.dll", SetLastError = true)]
    [return: MarshalAs(UnmanagedType.Bool)]
    static extern bool VirtualFree(IntPtr lpAddress, IntPtr dwSize, uint dwFreeType);

    private delegate int IntReturner();

    static void Main(string[] args)
    {
        List<byte> bodyBuilder = new List<byte>();
        bodyBuilder.Add(0xb8); // MOV EAX,
        bodyBuilder.AddRange(BitConverter.GetBytes(42)); // 42
        bodyBuilder.Add(0xc3);  // RET

        byte[] body = bodyBuilder.ToArray();

        IntPtr buf = IntPtr.Zero;

        try
        {
            // We VirtualAlloc body.Length bytes, with R/W access
            // Note that from what I've read, MEM_RESERVE is useless
            // if the first parameter is IntPtr.Zero
            buf = VirtualAlloc(IntPtr.Zero, (IntPtr)body.Length, MEM_COMMIT, PAGE_READWRITE);

            if (buf == IntPtr.Zero)
            {
                throw new Win32Exception();
            }

            // Copy our instructions in the buf
            Marshal.Copy(body, 0, buf, body.Length);

            // Change the access of the allocated memory from R/W to Execute
            uint oldProtection;
            bool result = VirtualProtect(buf, (IntPtr)body.Length, PAGE_EXECUTE, out oldProtection);

            if (!result)
            {
                throw new Win32Exception();
            }

            // Create a delegate to the "function"
            // Sadly we can't use Func<int>: GetDelegateForFunctionPointer
            // does not accept generic delegate types
            var fun = (IntReturner)Marshal.GetDelegateForFunctionPointer(buf, typeof(IntReturner));

            Console.WriteLine(fun());
        }
        finally
        {
            if (buf != IntPtr.Zero)
            {
                // Free the allocated memory
                bool result = VirtualFree(buf, IntPtr.Zero, MEM_RELEASE);

                if (!result)
                {
                    throw new Win32Exception();
                }
            }
        }
    }
}
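The IDisposable wrapper mentioned above could look roughly like the following. This is a sketch and not the original author's code: the names NativeCode, GetDelegate and PopCount are invented for illustration, and the POPCNT bytes are F3 0F B8 C1 (popcnt eax, ecx) for 64-bit Windows and F3 0F B8 44 24 04 (popcnt eax, [esp + 4]) for 32-bit, each followed by C3 (ret). It assumes Windows on an x86/x64 CPU that supports POPCNT. On a CPU without it, the call would fault with an illegal-instruction error.

```csharp
using System;
using System.ComponentModel;
using System.Runtime.InteropServices;

// Hypothetical helper: owns one block of executable memory and frees it on Dispose.
sealed class NativeCode : IDisposable
{
    const uint PAGE_READWRITE = 0x04;
    const uint PAGE_EXECUTE = 0x10;
    const uint MEM_COMMIT = 0x1000;
    const uint MEM_RELEASE = 0x8000;

    [DllImport("kernel32.dll", SetLastError = true)]
    static extern IntPtr VirtualAlloc(IntPtr lpAddress, IntPtr dwSize, uint flAllocationType, uint flProtect);

    [DllImport("kernel32.dll", SetLastError = true)]
    [return: MarshalAs(UnmanagedType.Bool)]
    static extern bool VirtualProtect(IntPtr lpAddress, IntPtr dwSize, uint flNewProtect, out uint lpflOldProtect);

    [DllImport("kernel32.dll", SetLastError = true)]
    [return: MarshalAs(UnmanagedType.Bool)]
    static extern bool VirtualFree(IntPtr lpAddress, IntPtr dwSize, uint dwFreeType);

    IntPtr buf;

    public NativeCode(byte[] body)
    {
        // Allocate as read/write, copy the code in, then switch to execute-only.
        buf = VirtualAlloc(IntPtr.Zero, (IntPtr)body.Length, MEM_COMMIT, PAGE_READWRITE);
        if (buf == IntPtr.Zero) throw new Win32Exception();
        try
        {
            Marshal.Copy(body, 0, buf, body.Length);
            uint oldProtection;
            if (!VirtualProtect(buf, (IntPtr)body.Length, PAGE_EXECUTE, out oldProtection))
                throw new Win32Exception();
        }
        catch
        {
            Dispose(); // do not leak the allocation if setup fails
            throw;
        }
    }

    // delegateType must be a non-generic delegate type.
    public Delegate GetDelegate(Type delegateType)
    {
        if (buf == IntPtr.Zero) throw new ObjectDisposedException("NativeCode");
        return Marshal.GetDelegateForFunctionPointer(buf, delegateType);
    }

    public void Dispose()
    {
        if (buf != IntPtr.Zero)
        {
            VirtualFree(buf, IntPtr.Zero, MEM_RELEASE);
            buf = IntPtr.Zero;
        }
    }
}

static class PopCountDemo
{
    // Cdecl so that the plain RET in the 32-bit version is correct
    // (the caller cleans up the stack). On x64 the attribute has no effect.
    [UnmanagedFunctionPointer(CallingConvention.Cdecl)]
    public delegate int PopCount(uint value);

    public static byte[] PopCountBody()
    {
        return Environment.Is64BitProcess
            ? new byte[] { 0xF3, 0x0F, 0xB8, 0xC1, 0xC3 }              // popcnt eax, ecx ; ret
            : new byte[] { 0xF3, 0x0F, 0xB8, 0x44, 0x24, 0x04, 0xC3 }; // popcnt eax, [esp+4] ; ret
    }

    static void Main()
    {
        using (var code = new NativeCode(PopCountBody()))
        {
            var popCount = (PopCount)code.GetDelegate(typeof(PopCount));
            Console.WriteLine(popCount(0xFFu)); // 0xFF has eight set bits
        }
    }
}
```

The delegate must not be called after the NativeCode instance is disposed, because the memory behind it is gone. Also keep in mind that every call through such a delegate pays for a managed-to-native transition, so for a loop that runs millions of times it is worth measuring whether this beats a plain managed bit-counting routine. On .NET Core 3.0 and later, System.Numerics.BitOperations.PopCount and System.Runtime.Intrinsics.X86.Popcnt.PopCount expose the instruction directly and avoid all of this.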
Teodoro answered 13/3, 2015 at 19:51 Comment(8)
Better to call VirtualProtect after the copy, to add the X bit and remove W, since enforcing W^X seems to be good for security. – Mushroom
@BenVoigt I preferred to copy the code example verbatim... But yes, it's normally better to do as you said. – Teodoro
popcnt eax, [esp + 4] would be F3 0F B8 44 24 04, by the way, so you can throw that in. For Win64 calling conventions, popcnt eax, ecx is F3 0F B8 C1. – Nichrome
@BenVoigt Now that I've used try... finally and VirtualProtect I feel more... clean :) – Teodoro
Why is the IntReturner needed? – Azzieb
@Chris You need a delegate to "point" to the asm function. And it can't be one of the Func<> or Action<> delegates, because GetDelegateForFunctionPointer doesn't like them. – Teodoro
@xanatos Yeah, but is there an actual technical reason why it wouldn't be possible if someone decided to make GetDelegateForFunctionPointer like them? It seems like quite useful functionality. – Azzieb
@ChrisEelmaa I don't know why Microsoft decided that supporting generic delegates in GetDelegateForFunctionPointer was too complex. – Teodoro
