Array of pointers in C++/CLI MSIL assembly

I'm trying to wrap some legacy C code for use with C# running on .NET Core. I'm using the approach given here to create a C++ wrapper that compiles to pure MSIL. It's working well for simple functions, but I've found that if my code ever uses pointers-to-pointers or arrays of pointers it will crash with a memory violation. Often it crashes Visual Studio and I have to restart everything, which is tedious.

For example, the following code will cause the crashes:

public ref class example
    {
    public:

        static void test() {
            Console::WriteLine("\nTesting pointers.");

            double a[5] = {5,6,7,8,9}; //Array.
            double *b = a; //Pointer to first element in array.

            Console::WriteLine("\nTesting bare pointers.");
            Console::WriteLine(a[0]); //Prints 5.
            Console::WriteLine(b[0]); //Prints 5.

            Console::WriteLine("\nTesting pointer-to-pointer.");
            double **c = &b;
            Console::WriteLine(c == &b); //Prints true.
            Console::WriteLine(b[0]); //Works, prints 5.
            Console::WriteLine(**c); //Crashes with memory access violation.

            Console::WriteLine("\nTesting array of pointers.");
            double* d[1];
            d[0] = b;
            Console::WriteLine(d[0] == b); //Prints false???
            Console::WriteLine(b[0]); //Works, prints 5.
            Console::WriteLine(d[0][0]); //Crashes with memory access violation.

            Console::WriteLine("\nTesting CLI array of pointers.");
            cli::array<double*> ^e = gcnew cli::array<double*> (5);
            e[0] = b;
            Console::WriteLine(e[0] == b); //Prints false???
            Console::WriteLine(b[0]); //Works, prints 5.
            Console::WriteLine(e[0][0]); //Crashes with memory access violation.
        }
    };

Note that simply using pointers doesn't cause any problem. It's only when there is that extra level of indirection.

If I drop the code into a CLR C++ console app, it works exactly as expected and doesn't crash. The crash only happens when the code is compiled into an MSIL assembly with /clr:pure and run from a .NET Core app.

What could be going on?

Update 1: Here are the Visual Studio files: https://app.box.com/s/xejfm4s46r9hs0inted2kzhkh9qzmjpb It's two projects. The MSIL assembly is called library and the CoreApp is a C# console app that'll call the library. Warning, it's likely to crash Visual Studio when you run it.

Update 2: I noticed this too:

        double a[5] = { 5,6,7,8,9 };
        double* d[1];
        d[0] = a;
        Console::WriteLine(d[0] == a); //Prints true.
        Console::WriteLine(IntPtr(a)); //Prints a number.
        Console::WriteLine(IntPtr(d[0])); //Prints a completely different number.
Aegeus answered 4/7, 2018 at 21:57 Comment(5)
PITA to build, but works just fine, as expected. This just isn't special at all. – Comprehend
@HansPassant Did you build it as a pure MSIL assembly and then call it from a .NET Core app? Those are the only conditions it crashes under. It won't crash if built as, or run from, a desktop (.NET Framework) app. – Aegeus
github.com/mono/CppSharp – Ingeringersoll
Of course. If you want somebody to look at your solution then you'll have to publish it somewhere. – Comprehend
@HansPassant Here is the code. It has two projects. "Library" is the above code in C++/CLI, configured to compile as pure MSIL. CoreApp is a C# console app that'll call the library. I've tried it on a couple of computers. It crashes and it usually takes Visual Studio with it. app.box.com/s/xejfm4s46r9hs0inted2kzhkh9qzmjpb Thanks for looking! – Aegeus

This looks like an issue in the generated IL for the test method. At the point of the crash we're reading **c, and c is local number 5.

IL_00a5  11 05             ldloc.s      0x5
IL_00a7  4a                ldind.i4    
IL_00a8  4f                ldind.r8    
IL_00a9  28 11 00 00 0a    call         0xA000011

So here we see the IL says to load the value of c, then load a 4 byte signed integer, then treat that integer as a pointer and load an 8 byte real type (double).

On a 64-bit platform, pointers should be either size-neutral or 64 bits wide, so the ldind.i4 is problematic: the underlying address is 8 bytes. Because the IL specifies reading only 4 bytes, the JIT must extend the result to get an 8-byte value, and here it chooses to sign-extend.

library.h @ 27:
00007ffd`b0cf2119 488b45a8        mov     rax,qword ptr [rbp-58h]
00007ffd`b0cf211d 8b00            mov     eax,dword ptr [rax]
00007ffd`b0cf211f 4863c0          movsxd  rax,eax   // **** sign extend ****
>>> 00007ffd`b0cf2122 c4e17b1000      vmovsd  xmm0,qword ptr [rax]
00007ffd`b0cf2127 e854f6ffff      call    System.Console.WriteLine(Double) (00007ffd`b0cf1780)

You apparently get lucky when running on the full framework: the array address is small and fits in 31 bits or fewer, so reading 4 bytes and then sign-extending to 8 bytes still gives the right address. On Core it doesn't fit, which is why the app crashes there.
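To make the truncation concrete, here is a minimal sketch in standard C++ (not C++/CLI) that models the 4-byte load followed by sign extension. The helper names are hypothetical, and the sample addresses are illustrative values only: one small address, and the 64-bit value that appears in the disassembly above.

```cpp
#include <cstdint>

// Models `ldind.i4` followed by sign extension on a 64-bit JIT:
// only the low 32 bits of the stored address are read, then bit 31
// is copied into the upper 32 bits.
// (Hypothetical helper; not part of any runtime API.)
std::uint64_t load_as_i4_then_extend(std::uint64_t address) {
    std::uint64_t low = address & 0xFFFFFFFFULL;   // the 4 bytes actually read
    if (low & 0x80000000ULL) {                     // sign bit set?
        low |= 0xFFFFFFFF00000000ULL;              // sign-extend to 64 bits
    }
    return low;
}

// True when the address comes back unchanged from the 4-byte round trip,
// i.e. when it fits in 31 bits.
bool survives_truncation(std::uint64_t address) {
    return load_as_i4_then_extend(address) == address;
}
```

With these helpers, survives_truncation(0x00A4F3C8) is true, while survives_truncation(0x00007FFDB0CF2119) is false because the round trip yields 0xFFFFFFFFB0CF2119, which is a different address.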

It appears you generated your library using the Win32 target. If you rebuild it with an x64 target the IL will use a 64 bit load for *c:

IL_00ab:  ldloc.s    V_5
IL_00ad:  ldind.i8
IL_00ae:  ldind.r8
IL_00af:  call       void [mscorlib]System.Console::WriteLine(float64)

and the app runs just fine.

It appears this is a feature in C++/CLI -- the binaries it produces are implicitly architecture dependent even in pure mode. Only /clr:safe can produce architecture-independent assemblies, and you can't use that with this code since it contains unverifiable constructs like pointers.
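One way to fail early instead of crashing is to compare the pointer width the library was compiled for with the width of the hosting process. The following is only a sketch in standard C++ under that assumption; the function name is hypothetical, and in a real C++/CLI wrapper the argument would presumably come from System::IntPtr::Size.

```cpp
#include <cstddef>
#include <stdexcept>

// Pointer width baked in at compile time: 4 for a Win32 build, 8 for x64.
constexpr std::size_t kCompiledPointerBytes = sizeof(void*);

// Hypothetical guard: throws if the hosting process uses a different
// pointer width than the one this code was compiled for.
void require_matching_pointer_width(std::size_t host_pointer_bytes) {
    if (host_pointer_bytes != kCompiledPointerBytes) {
        throw std::runtime_error(
            "library was built for a different pointer width than the host process");
    }
}
```

Calling such a guard once at startup would turn the silent address truncation into an immediate, descriptive error.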

Also, please note that not all features of C++/CLI are supported in .NET Core 2.x. This particular example avoids the unsupported bits, but more complex ones may not.

Mitchum answered 8/7, 2018 at 18:2 Comment(2)
Thanks for your answer. Maybe you could clarify a point for me? A single bare pointer works fine. It's only when a second level of indirection is involved that things go bad. Why is that? – Aegeus
Since b is a local, the IL uses ldloc to get its value, and since b is typed as double* (float64* in IL), this always gives the right-sized pointer. Likewise, when loading c the IL gets the right-sized pointer, but loading *c is where the IL has to tell the JIT what size to use, and here it always specifies 4 bytes for an x86 build and 8 for an x64 build. To put it another way, sometimes the size of the value to load is implicit (specified either by the opcode, ldloca say, or via a signature), and sometimes it is explicit. In the explicit case the IL producer must get the size right. – Mitchum
