Why can a WideString not be used as a function return value for interop?

I have, on more than one occasion, advised people to use a return value of type WideString for interop purposes.

The idea is that a WideString is the same as a BSTR. Because a BSTR is allocated on the shared COM heap, it is no problem to allocate it in one module and deallocate it in another: all parties have agreed to use the same heap, the COM heap.

However, it seems that WideString cannot be used as a function return value for interop.

Consider the following Delphi DLL.

library WideStringTest;

uses
  ActiveX;

function TestWideString: WideString; stdcall;
begin
  Result := 'TestWideString';
end;

function TestBSTR: TBstr; stdcall;
begin
  Result := SysAllocString('TestBSTR');
end;

procedure TestWideStringOutParam(out str: WideString); stdcall;
begin
  str := 'TestWideStringOutParam';
end;

exports
  TestWideString, TestBSTR, TestWideStringOutParam;

begin
end.

and the following C++ code:

typedef BSTR (__stdcall *Func)();
typedef void (__stdcall *OutParam)(BSTR &pstr);

HMODULE lib = LoadLibrary(DLLNAME);
Func TestWideString = (Func) GetProcAddress(lib, "TestWideString");
Func TestBSTR = (Func) GetProcAddress(lib, "TestBSTR");
OutParam TestWideStringOutParam = (OutParam) GetProcAddress(lib,
                   "TestWideStringOutParam");

BSTR str = TestBSTR();
wprintf(L"%s\n", str);
SysFreeString(str);
str = NULL;

TestWideStringOutParam(str);
wprintf(L"%s\n", str);
SysFreeString(str);
str = NULL;

str = TestWideString();//fails here
wprintf(L"%s\n", str);
SysFreeString(str);

The call to TestWideString fails with this error:

Unhandled exception at 0x772015de in BSTRtest.exe: 0xC0000005: Access violation reading location 0x00000000.

Similarly, if we try to call this from C# with p/invoke, we have a failure:

[DllImport(@"path\to\my\dll")]
[return: MarshalAs(UnmanagedType.BStr)]
static extern string TestWideString();

The error is:

An unhandled exception of type 'System.Runtime.InteropServices.SEHException' occurred in ConsoleApplication10.exe

Additional information: External component has thrown an exception.

Calling TestWideStringOutParam via p/invoke works as expected.

So, using pass-by-reference with WideString parameters and mapping them onto BSTR appears to work perfectly well, but the same does not hold for function return values. I have tested this on Delphi 5, 2010 and XE2 and observe the same behaviour on all versions.

Execution enters the Delphi code and fails almost immediately. The assignment to Result turns into a call to System._WStrAsg, the first line of which reads:

CMP     [EAX],EDX

Now, EAX is $00000000 and naturally there is an access violation.

Can anyone explain this? Am I doing something wrong? Am I unreasonable in expecting WideString function values to be viable BSTRs? Or is it just a Delphi defect?

Rollins answered 19/2, 2012 at 13:23 Comment(21)
David, Maybe add C++, C# tags also?Majoriemajority
@Majoriemajority I believe that it's really a question about how Delphi implements return values. I think Delphi is the odd one out.Rollins
@J... I've never seen a COM method that didn't return an HRESULT. I'm not talking about using BSTR in COM though. I'm talking about it as a convenient way to share a heap between different modules.Rollins
@J... Assign to a WideString and it does indeed call SysAllocString. Or it might be SysReallocString but that's morally equivalent.Rollins
@David: Delphi is no more "the odd one out" than any other language. Even different implementations of C don't agree on how this must be done. Returning non-POD types from a function is always a problem when different languages are involved.Discursive
@Rudy BSTR is a POD type. It's just a pointer. Which other language do you know that has this problem?Rollins
@David: I already said it: C has no uniform way to return non-POD types. Sometimes like in Delphi, as reference parameter, sometimes in one or even two registers, sometimes on some kind of stack, etc. Other languages may also use one of these. And BSTR points to a type that also has a preceding length dword. That is why it can't be treated like a normal PWideChar, just like AnsiString or UnicodeString can't be treated like a normal P(Wide)Char, even if they actually are.Discursive
@rudy BSTR is a POD so your comments don't apply here. I can return a TBStr no probs. It's just a PWideChar for the sake of parameter passing.Rollins
@DavidHeffernan, I think that WideString is not a POD because it involves the SysAllocString "magic". I still don't understand the explanation in the accepted answer.Majoriemajority
@Majoriemajority That's akin to saying that THandle is not POD because you need to call a special function to make one. The issue, to the best of my understanding is a combination of automatic management of BSTR via the WideString compiler magic, and the semantics of return values being INOUT parameters. I have to say, it makes no sense to me that return values have IN semantics.Rollins
@DavidHeffernan, so procedure TestWideStringOutParam(var str: WideString); stdcall (note the var) won't work? Or am I still getting it wrong? (because it does work)Majoriemajority
@Majoriemajority In that code you can have the same parameter semantics at both ends. Incidentally you called it OUT but it is in fact INOUT. With the function return value, all languages that I know, other than Delphi, treat the return value as an OUT. But Delphi treats it as INOUT and therefore assume that the BSTR pointer is valid on entry.Rollins
@DavidHeffernan, thanks for clearing that out. still need to digest it though ;) I had the same conclusion only I was not able to explain why...Majoriemajority
@David: A BSTR (and thus WideString) is not a POD, just like a UnicodeString (also in fact just a pointer) is not a POD.Discursive
@Rudy All C types are POD. Since BSTR is a C type, it is a POD. The issue is not related to the type. The issue is with the mismatch in the semantics of return values between languages. Sadly Delphi is the odd one out here.Rollins
No, @David, Delphi is NOT the odd one out here. How items (even scalars) are returned even differs between implementations of C.Discursive
FWIW, @David, I'm sure it DOES work for BSTR. Just not for WideString. And being a COM-managed type means that it is not a simple POD type, no more than AnsiString, which you would not return from a DLL to be used in C code.Discursive
@rudy 1. Which C implementations on Windows differ in this respect? 2. BSTR is POD. WideString is described in documentation as being compatible with BSTR. In this regard it is not.Rollins
@David: several. GNU C++, VC++, C++Builder, etc. differ in this respect, even for POD types. VC++ will return e.g. structs (even POD structs) in EAX:EDX or RDX when they are up to 64 bit in size, otherwise it takes the same approach as C++Builder. GNU takes AFAIK the same approach as C++Builder. But also scalar types like int64_t or float are handled differently. These are the types with which I had problems when converting. There are more. Fact is that the implementations don't always agree on how items are returned, and that returning anything but simple scalars (and BSTR is not!) is tricky.Discursive
@Rudy BSTR is a simple scalar. It's a pointer. The problem is that function result is an INOUT parameter in Delphi and an OUT parameter in all other tools that count. I'm repeating myself. I understand what you say in your latest comment, and I don't disagree. It's just that it is not pertinent. You are talking about which registers are used for parameter passing. That's not relevant here since there is no such mismatch in my example.Rollins
A related question: #3251327 This is documented weird/non-intuitive behaviour of Delphi.Sovereign

In regular Delphi functions, the function return is actually a parameter passed by reference, even though syntactically it looks and feels like an 'out' parameter. You can test this out like so (this may be version dependent):

function DoNothing: IInterface;
begin
  if Assigned(Result) then
    ShowMessage('result assigned before invocation')
  else
    ShowMessage('result NOT assigned before invocation');
end;

procedure TestParameterPassingMechanismOfFunctions;
var
  X: IInterface;
begin
  X := TInterfacedObject.Create;
  X := DoNothing;
end;

To demonstrate, call TestParameterPassingMechanismOfFunctions().

Your code is failing because of a mismatch between Delphi and C++'s understanding of the calling convention in relation to the passing mechanism for function results. In C++ a function return acts like the syntax suggests: an out parameter. But for Delphi it is a var parameter.
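
The mismatch can be modelled in portable C++ without any Windows headers. This is only a rough sketch, not Delphi's actual code generation: FakeBstr, delphi_style_test_wide_string and call_like_delphi_expects are hypothetical names, and the in/out behaviour is simulated by having the callee read the incoming value before overwriting it (the question reports `CMP [EAX],EDX` as the first instruction of the assignment helper, which is such a read).

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for BSTR so the sketch builds without <windows.h>.
using FakeBstr = const wchar_t*;

// What the C++ caller in the question assumes: the pointer comes back
// as an ordinary return value and nothing is passed in.
using AssumedByCaller = FakeBstr (*)();

// What the Delphi function expects, per the explanation above: the address
// of a string variable arrives as an extra by-reference argument.
using ExpectedByDelphi = void (*)(FakeBstr* result);

// Records whether the callee found a value already stored in the variable.
bool g_saw_previous_value = false;

// Model of the Delphi side: it reads *result before writing to it, so the
// address must be valid and the variable must hold a sane value on entry.
void delphi_style_test_wide_string(FakeBstr* result) {
    assert(result != nullptr);  // a caller that passes nothing breaks this
    g_saw_previous_value = (*result != nullptr);
    *result = L"TestWideString";
}

// A caller that matches the callee: it owns the variable, initialises it,
// and passes its address.
std::wstring call_like_delphi_expects(ExpectedByDelphi fn) {
    FakeBstr value = nullptr;   // must be valid on entry (in/out semantics)
    fn(&value);
    return std::wstring(value);
}
```

Calling call_like_delphi_expects(delphi_style_test_wide_string) yields the string. A caller using the AssumedByCaller shape never supplies the address at all, so in the real DLL the callee dereferences whatever happens to be in that stack slot.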

To fix, try this:

function TestWideString: WideString; stdcall;
begin
  Pointer(Result) := nil;
  Result := 'TestWideString';
end;
Lessor answered 19/2, 2012 at 14:3 Comment(16)
That sounds plausible, but Pointer(result) := nil itself raises an AV.Rollins
For functions, Delphi stores the pointer to the result in EAX. This pretty much explains it. From Delphi's point of view you can't pass in "no variable" as a var parameter.Lessor
I am not sure if you want just an explanation or a work-around. You probably don't need a work-around because, as you have already stated, the explicit out parameter mechanism works.Lessor
Explanation is just fine. Workarounds abound. I suppose I'm curious as to why Pointer(Result) := nil fails with AV.Rollins
Pointer(Result) := nil fails with AV because the reference to "Result" is a dangling pointer. It is dangling because C++ doesn't see the point of setting it because it has a different model of the call.Lessor
@SeanB.Durkin But I should be able to assign nil to a dangling pointer.Rollins
It is dangling because C++ is not passing a viable WideString instance to the Delphi Result parameter, that is why the Delphi code is failing. WideString is a built in Delphi type that wraps a BSTR. It is not a BSTR by itself. In C++, WideString is a class type. You need to change the C++ function declaration to return a WideString instead of a BSTR. That way, the C++ compiler generates different machine code to pass the WideString to Delphi, and the Delphi code receives a viable WideString to operate on.Ephrem
@Remy That assumes Emba C++ compiler. I'm using MS. Also would like to use C# and pinvoke with MarshalAs. But I don't buy what you say. Why does Pointer(Result) := nil throw AV? And if WideString is not binary compatible with BSTR, how can we do what we do with parameters. I guess I must be missing something.Rollins
Pointer(Result) := nil throws an AV because the return type is actually a pointer to a WideString (hidden out parameter). And by assigning it nil, the pointer (that was never handed over by C++) is dereferenced: mov eax,[ebp+$08]; xor edx,edx; mov [eax],edx. In other words: WideString return values are always passed as hidden out parameters. Delphi doesn't allow you to change that behavior.Cru
However, it may be possible to trick Delphi by returning a PWideChar: (untested) function TestWideString: PWideChar; stdcall; var RealResult: WideString absolute Result; begin Initialize(RealResult); RealResult := 'TestWideString'; end;Kutaisi
@AndreasHausladen Why is it no good with a hidden out but fine with a normal out? Is that the key? I have to say I am struggling to work out why this is failing. I mean, it doesn't really matter, but I do like to understand what my tools are doing.Rollins
@DavidHeffernan That's because C++ doesn't use that hidden out. In C++, BSTR is a typedef for unsigned short * or wchar_t * (I'm not sure which), a pointer type without any special behaviour from the compiler, and pointer values are returned directly.Kutaisi
@hvd I think the definition of BSTR is a little irrelevant here. After all you can map WideString <--> BSTR for parameters. If you look at the Delphi implementation it's just a pointer to WideChar. The same applies to Remy's comment. The fact that C++ Builder wraps it in a class is surely beside the point. Presumably the only field of the class is a pointer to wchar_t and so binary compatible with BSTR.Rollins
@DavidHeffernan That's just it: it's not "just a pointer to WideChar" in Delphi. There's a lot of compiler magic to automatically call special built-in functions that don't and shouldn't get called for PWideChar. And I expect wrapping BSTR in a class, and returning that class from a function, is also not binary compatible in C++ with returning a BSTR directly.Kutaisi
@hvd I don't buy that argument. Are you saying that because Delphi interfaces have lots of extra magic code around them (automatic reference counting) that they are not binary compatible with COM interfaces. If WideString wasn't binary compatible with BSTR then we wouldn't be able to declare COM methods as taking WideString parameters where the C declarations take BSTR. Clearly WideString and BSTR are binary compatible. Ultimately the data is just a pointer to wide char. The problem appears to be in parameter passing conventions not matching.Rollins
@DavidHeffernan You're right that that part was a poor argument, but I stand by my conclusion. Both WideString and BSTR have the size of a pointer, but that doesn't mean they're always passed the same way. They are close enough so that they're passed the same way for procedure and function arguments, but if the stdcall calling convention returns structures via a hidden out parameter, and WideString is treated as a structure, then it won't be returned the same way as a BSTR (PWideChar).Kutaisi

In C#/C++ you will need to declare the Result as an out parameter in order to maintain binary compatibility under the stdcall calling convention:

Returning Strings and Interface References From DLL Functions

In the stdcall calling convention, the function’s result is passed via the CPU’s EAX register. However, Visual C++ and Delphi generate different binary code for these routines.

Delphi code stays the same:

function TestWideString: WideString; stdcall;
begin
  Result := 'TestWideString';
end;

C# code:

// declaration
[DllImport(@"Test.dll")]
static extern void TestWideString([MarshalAs(UnmanagedType.BStr)] out string Result);
...
string s;
TestWideString(out s);
MessageBox.Show(s);
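
A native C++ caller could apply the same change. The sketch below is an assumption-laden illustration rather than tested interop code: TestWideStringFn, call_test_wide_string and the DLL_CALL macro are hypothetical names, and the non-Windows branch contains stand-ins that exist only so the snippet compiles elsewhere.

```cpp
#include <cassert>
#include <string>

#if defined(_WIN32)
#include <windows.h>
#include <oleauto.h>
#define DLL_CALL __stdcall
#else
// Stand-ins so the sketch also compiles off Windows; real code uses the
// definitions from <windows.h> and <oleauto.h>.
#define DLL_CALL
using BSTR = wchar_t*;
inline void SysFreeString(BSTR) {}
#endif

// Delphi "function TestWideString: WideString; stdcall" declared as a
// procedure taking one by-reference BSTR, mirroring the C# declaration.
using TestWideStringFn = void (DLL_CALL *)(BSTR* result);

// Calls the function, copies the text, and frees the BSTR.
std::wstring call_test_wide_string(TestWideStringFn fn) {
    assert(fn != nullptr);
    BSTR str = nullptr;   // start from an empty (null) string
    fn(&str);
    // Copies up to the first null character; embedded nulls are not preserved.
    std::wstring text = str ? std::wstring(str) : std::wstring();
    SysFreeString(str);   // a null BSTR is accepted
    return text;
}
```

With the setup from the question, the function pointer would come from GetProcAddress(lib, "TestWideString") cast to TestWideStringFn instead of Func.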
Majoriemajority answered 19/2, 2012 at 22:14 Comment(4)
+1 Yes that does it. I still cannot get my head around what's really going on here though!!Rollins
Note that from my testing it seems that the Result parameter is always first in the list if you have multiple parameters, not last as might be assumed.Ogawa
@JamieKitson I don't understand that comment. If you mean the Delphi implicit var parameter that is used to return the function return value, the extra parameter is passed after the others. It's documented clearly here: docwiki.embarcadero.com/RADStudio/en/…Rollins
@DavidHeffernan Perhaps what Jamie observed is that the parameters are passed in reversed order with stdcall, as your link also states (last first). So the "result" parameter, which is the last on the declaration/Delphi side, is passed the first at stub/asm level.Differential
