Why can't I take the address of a nested local function in 64-bit Delphi?
Note: since related questions have been closed, more examples have been added below.

The simple code below (which finds a top-level IE window and enumerates its children) works fine with a '32-bit Windows' target platform. There's no problem with earlier versions of Delphi either:

procedure TForm1.Button1Click(Sender: TObject);

  function EnumChildren(hwnd: HWND; lParam: LPARAM): BOOL; stdcall;
  const
    Server = 'Internet Explorer_Server';
  var
    ClassName: array[0..24] of Char;
  begin
    Assert(IsWindow(hwnd));            // <- Assertion fails with 64-bit
    GetClassName(hwnd, ClassName, Length(ClassName));
    Result := ClassName <> Server;
    if not Result then
      PUINT_PTR(lParam)^ := hwnd;
  end;

var
  Wnd, WndChild: HWND;
begin
  Wnd := FindWindow('IEFrame', nil); // top level IE
  if Wnd <> 0 then begin
    WndChild := 0;
    EnumChildWindows(Wnd, @EnumChildren, UINT_PTR(@WndChild));

    if WndChild <> 0 then
      ..    

end;


I've inserted an Assert to indicate where it fails with a '64-bit Windows' target platform. There's no problem with the code if I un-nest the callback.

I'm not sure if the erroneous values passed in the parameters are just garbage or are due to some misplaced memory addresses (calling convention?). Is nesting callbacks in fact something that I should never do in the first place? Or is this just a defect that I have to live with?

edit:
In response to David's answer, here is the same code with EnumChildWindows declared with a typed callback. It works fine with 32-bit:

(edit: The below does not really test what David says since I still used the '@' operator. It works fine with the operator, but if I remove it, it indeed does not compile unless I un-nest the callback)

type
  TFNEnumChild = function(hwnd: HWND; lParam: LPARAM): Bool; stdcall;

function TypedEnumChildWindows(hWndParent: HWND; lpEnumFunc: TFNEnumChild;
    lParam: LPARAM): BOOL; stdcall; external user32 name 'EnumChildWindows';

procedure TForm1.Button1Click(Sender: TObject);

  function EnumChildren(hwnd: HWND; lParam: LPARAM): BOOL; stdcall;
  const
    Server = 'Internet Explorer_Server';
  var
    ClassName: array[0..24] of Char;
  begin
    Assert(IsWindow(hwnd));            // <- Assertion fails with 64-bit
    GetClassName(hwnd, ClassName, Length(ClassName));
    Result := ClassName <> Server;
    if not Result then
      PUINT_PTR(lParam)^ := hwnd;
  end;

var
  Wnd, WndChild: HWND;
begin
  Wnd := FindWindow('IEFrame', nil); // top level IE
  if Wnd <> 0 then begin
    WndChild := 0;
    TypedEnumChildWindows(Wnd, @EnumChildren, UINT_PTR(@WndChild));

    if WndChild <> 0 then
      ..

end;

Actually, this limitation is not specific to Windows API callbacks. The same problem occurs when taking the address of such a function into a variable of procedural type and passing it, for example, as a custom comparator to TList.Sort.

http://docwiki.embarcadero.com/RADStudio/Rio/en/Procedural_Types

procedure TForm2.btn1Click(Sender: TObject);
var s : TStringList;

  function compare(s : TStringList; i1, i2 : integer) : integer;
  begin
    result := CompareText(s[i1], s[i2]);
  end;

begin
  s := TStringList.Create;
  try
    s.add('s1');
    s.add('s2');
    s.add('s3');
    s.CustomSort(@compare);
  finally
    s.free;
  end;
end;

It works as expected when compiled as 32-bit, but fails with an access violation when compiled for Win64. In the 64-bit version, inside function compare, s = nil and i2 holds some random value.

It also works as expected for the Win64 target if the compare function is moved out of btn1Click to unit level.
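A minimal sketch of that un-nested variant, assuming a console program built with XE2 or later (namespaced units). The names CompareItems and SortedSample are made up for illustration. Because CompareItems is declared at unit level, it can be passed to CustomSort without the '@' operator, so the compiler checks its signature against TStringListSortCompare:

```pascal
program SortDemo;
{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes;

// Unit-level routine: no hidden frame pointer, so the signature the caller
// sees is the signature the callee actually has.
function CompareItems(List: TStringList; Index1, Index2: Integer): Integer;
begin
  Result := CompareText(List[Index1], List[Index2]);
end;

// Hypothetical helper: sorts three sample strings and returns them
// as a comma-separated string.
function SortedSample: string;
var
  S: TStringList;
begin
  S := TStringList.Create;
  try
    S.Add('s3');
    S.Add('s1');
    S.Add('s2');
    S.CustomSort(CompareItems); // no '@', so the call is type-checked
    Result := S.CommaText;
  finally
    S.Free;
  end;
end;

begin
  Writeln(SortedSample); // prints s1,s2,s3
end.
```

If the comparator needs data from the calling routine, that data has to travel another way, for example through a field of a TStringList descendant, since a unit-level routine cannot see the caller's locals.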

Asarum answered 15/4, 2012 at 14:9 Comment(5)
If the un-nested code is functionally equivalent to the nested one, then it's either a compiler bug or a callback parameter-passing problem. I vote for the latter, you may be getting stack corruption as the 64-bit calling convention is different than the 32-bit ones, so perhaps "stdcall" is not what you should use here. Try removing it and see if it happens again. Otherwise nesting callbacks is perfectly fine (at least in the way shown here).Haldane
There IS ONLY ONE calling convention in Win64; So whatever calling convention you specify in the code is ignored in 64 bit mode. However your function/procedure/dll-import signature might be wrong, and that might corrupt things. However the inability to have Delphi implement a callback from a function does sound like a compiler bug.Ornithic
@Haldane - The assertion again fails when I remove 'stdcall', I guess the Delphi compiler just ignores it when 64-bit is targeted.Asarum
Okay david's got it. Local functions are probably not invocable this way. Make it a regular non-OOP global method. I would log this in QC as something that should raise a warning. (@EnumChildren is taking the address of a local function.) Maybe not a real compiler bug, but perhaps a warning since it used to work in win32.Ornithic
We have found this to be a problem not just with winapi call backs, but with passing local functions/procedures as call back to our own Delphi code as well, especially when multi-threading is involved. I don't remember the details exactly, but it did have to do with how the compiler treats local methods and what it does / does not insert and what is available on the stack at the time the call back is triggered.Kagera

This trick was never officially supported by the language, and you have been getting away with it to date due to the implementation specifics of the 32-bit compiler. The documentation is clear:

Nested procedures and functions (routines declared within other routines) cannot be used as procedural values.

If I recall correctly, an extra hidden parameter, a pointer to the enclosing stack frame, is passed to nested functions. In 32-bit code this parameter is omitted if the nested function makes no reference to the enclosing environment. In 64-bit code the extra parameter is always passed.
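The effect of such a hidden parameter can be modelled with ordinary unit-level routines. The sketch below is only a model and not actual compiler output: ParentFrame stands in for the hidden frame pointer, and all names (Callee, TVisibleProc, SeenFrame, SeenA) are hypothetical. The caller believes it is calling a two-parameter routine, while the callee expects three, so every argument arrives one slot late:

```pascal
program HiddenParamDemo;
{$APPTYPE CONSOLE}

type
  // The shape the caller believes it is calling: two visible parameters.
  TVisibleProc = procedure(A, B: NativeInt);

var
  SeenFrame: Pointer;
  SeenA: NativeInt;

// Stand-in for a nested routine; ParentFrame models the hidden parameter.
procedure Callee(ParentFrame: Pointer; A, B: NativeInt);
begin
  SeenFrame := ParentFrame; // receives the caller's first argument
  SeenA := A;               // receives the caller's second argument
  // B would be read from a register the caller never set,
  // so its value is undefined and is deliberately not used here.
end;

var
  Proc: TVisibleProc;
begin
  @Proc := @Callee; // untyped assignment: no signature check, like '@' in the question
  Proc(111, 222);
  // On Win64 the arguments are expected to arrive shifted by one position.
  Writeln(NativeInt(SeenFrame), ' ', SeenA);
end.
```

This matches the symptoms in the question: with a shift of one slot, the first real parameter is read from where the second was placed, and the last one is read from a register that holds whatever happened to be there.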

Of course, a big part of the problem is that the Windows unit uses untyped procedure types for its callback parameters. If typed procedures were used, the compiler could reject your code. In fact, I view this as justification for the belief that the trick you used was never legal. With typed callbacks, a nested procedure cannot be passed directly (that is, without the @ operator), even in the 32-bit compiler.

Anyway, the bottom line is that you cannot pass a nested function as a parameter to another function in the 64-bit compiler.
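A minimal sketch of the un-nested approach, assuming a Windows console program built with XE2 or later. The names FindServerChild and PWindowHandle are invented for illustration, and the larger class-name buffer and the check on the GetClassName result are additions that are not in the question's code. The callback lives at unit level and receives its state through lParam instead of through the enclosing routine's locals:

```pascal
program EnumDemo;
{$APPTYPE CONSOLE}

uses
  Winapi.Windows;

type
  PWindowHandle = ^HWND; // hypothetical helper type for the lParam payload

// Unit-level callback: no hidden frame pointer, so the parameters arrive
// exactly where EnumChildWindows puts them.
function FindServerChild(Wnd: HWND; Param: LPARAM): BOOL; stdcall;
const
  Server = 'Internet Explorer_Server';
var
  ClassName: array[0..255] of Char;
begin
  Result := True; // True means "keep enumerating"
  if GetClassName(Wnd, ClassName, Length(ClassName)) = 0 then
    Exit;         // not a valid window or no class name: skip it
  Result := ClassName <> Server;
  if not Result then
    PWindowHandle(Param)^ := Wnd; // hand the match back to the caller
end;

var
  Frame, Child: HWND;
begin
  Frame := FindWindow('IEFrame', nil); // top-level IE window, as in the question
  if Frame <> 0 then
  begin
    Child := 0;
    // The Windows unit declares the callback as an untyped pointer, so '@' is
    // still required here; the signature must be verified by eye.
    EnumChildWindows(Frame, @FindServerChild, LPARAM(@Child));
    if Child <> 0 then
      Writeln('Found the server child window');
  end;
end.
```

Passing a pointer through lParam replaces the one thing a nested callback would otherwise offer, namely access to the caller's local variables.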

Donelson answered 15/4, 2012 at 14:25 Comment(17)
Call it like this: TypedEnumChildWindows(Wnd, EnumChildren, LPARAM(@WndChild));Donelson
I see. But I'm not convinced that, just because the compiler decided to pass something hidden, this is not a compiler defect or that I was getting away with incorrect practice.Asarum
using @callback is the fundamental problem. You have abandoned the type system. Never pass a function pointer that way.Donelson
The thing is I am forced to use the '@' operator although I supply an exact signature for my callback, because the compiler breaks the type system since it can't handle local functions without using hidden parameters (at least on 64-bit).Asarum
No, you've got it the wrong way round. The compiler is telling you that you can't use local functions as callbacks. Using the @ operator is what takes you outside type safety. That is your problem.Donelson
Ok. This pretty much answers whether "nesting callbacks is something I should never do". Whether I like it or not, the compiler has never in fact allowed it (as can be seen with typed functions). Thanks @David! (PS: I still don't think this is anything to do with the type system (in this particular case), but that's beyond what the question asks)Asarum
@Foo is basically like casting to (void*)() in C, and thus, as David says, you're explicitly disabling type checking.Ornithic
@Warren, my point is, I'm explicitly disabling type checking not because my type is not an exact match, but because the compiler complains anyway.Asarum
@SertacAkyuz re "I still don't think this is anything to do with the type system". Where the type system comes into this is that it is the type system that is able to tell you that your local function does not meet the contract required. At least that's my personal viewpoint.Donelson
David is telling you that the implicit parameter causes a signature mismatch at runtime, which the compiler could and did warn you about.Ornithic
@Warren - And I'm saying that it's then the compiler which is breaking the type system of the language; it's the compiler which is inserting something implicit behind my back. (BTW, I think it's not an implicit parameter, I've looked at the memory layout and the correct parameter values are pushed one size of a pointer down, not up)Asarum
@Sertac I updated my answer to a documentation link which makes it clear that the trick that you and so many others have been using was never officially supported.Donelson
@David - Thanks! If I haven't already accepted and voted for your answer I'd do it now <g>.Asarum
@SertacAkyuz I actually posted my answer yesterday whilst out on the fells so I didn't have access to the documentation. Anyway, the continuing thread reminded me to look this up, and there it was!Donelson
Ok, I wasn't being cynical or something if that is how it looked like.Asarum
@sertac Oh I didn't think you were being cynical. I know you too well to ever suspect that! :-)Donelson
It's an implicit parameter alright! The compiler assumes it has its thing in 'rcx' and the parameters to the function are at 'rdx' and 'r8', while in fact there's no 'its thing' and the parameters are at 'rcx' and 'rdx'.Asarum
