Implementation of nested functions

I recently found out that gcc allows the definition of nested functions. In my opinion, this is a cool feature, but I wonder how it is implemented.

While it is certainly not difficult to implement direct calls of nested functions by passing a context pointer as a hidden argument, gcc also allows taking a pointer to a nested function and passing it to an arbitrary other function, which can in turn call the nested function. Because that caller knows only the type of the nested function, it obviously can't pass a context pointer.
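
For the direct-call case, the hidden context argument can be written out by hand in standard C. This is only a minimal sketch: struct foo_frame, g_lifted and sum_of_three_calls are hypothetical names chosen for illustration, not anything GCC generates.

```c
/* Hypothetical stand-in for the part of foo's stack frame that g captures. */
struct foo_frame {
    int counter;
};

/* g after lifting to file scope: the context arrives as an explicit
   argument instead of being found implicitly. */
static int g_lifted(struct foo_frame *frame)
{
    return frame->counter++;
}

/* Calls the lifted g three times directly and adds up what it returned. */
int sum_of_three_calls(void)
{
    struct foo_frame frame = { 0 };
    int sum = 0;

    for (int i = 0; i < 3; i++)
        sum += g_lifted(&frame);  /* a compiler would pass &frame for us */

    return sum;
}
```

The lifted function has the type int(*)(struct foo_frame *), not int(*)(void), which is exactly why this approach alone cannot produce a pointer that an unsuspecting caller can use.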

I know that other languages like Haskell, which have a more convoluted calling convention, allow partial application to support this kind of thing, but I see no way to do that in C. How is it possible to implement this?

Here is a small example of a case that illustrates the problem:

int foo(int x, int (*f)(int, int (*)(void))) {
  int counter = 0;
  int g(void) { return counter++; }

  return f(x, g);
}

Here foo calls f, which in turn calls g; g returns the counter from foo's context and increments it at the same time.

Overgrow answered 18/11, 2011 at 8:16 Comment(3)
I didn't realise you could pass around pointers to nested functions. That's a really good question of how they work - presumably calling the pointer once the outer function has returned leads to bad behaviour?Lifeordeath
@Autopulated That is actually true and logical, since the corresponding stack frame no longer exists.Overgrow
Hard to call it a "cool" feature.Charlatanism

GCC uses something called a trampoline.

Information: http://gcc.gnu.org/onlinedocs/gccint/Trampolines.html

A trampoline is a piece of code that GCC creates on the stack when you need a pointer to a nested function. In your code, the trampoline is necessary because you pass g as an argument to a function call. A trampoline initializes some registers so that the nested function can refer to variables in the outer function, then jumps to the nested function itself. Trampolines are very small: you "bounce" off a trampoline and into the body of the nested function.

Using nested functions this way requires an executable stack, which is discouraged these days. There is not really any way around it.

Dissection of a trampoline:

Here is an example of a nested function in GCC's extended C:

void func(int (*param)(int));

void outer(int x)
{
    int nested(int y)
    {
        // If x is not used somewhere in here,
        // then the function will be "lifted" into
        // a normal, non-nested function.
        return x + y;
    }
    func(nested);
}

It's very simple, so we can see how it works. Here is the resulting assembly of outer, with some boilerplate omitted:

subq    $40, %rsp
movl    $nested.1594, %edx
movl    %edi, (%rsp)
leaq    4(%rsp), %rdi
movw    $-17599, 4(%rsp)
movq    %rsp, 8(%rdi)
movl    %edx, 2(%rdi)
movw    $-17847, 6(%rdi)
movw    $-183, 16(%rdi)
movb    $-29, 18(%rdi)
call    func
addq    $40, %rsp
ret

You'll notice that most of what it does is write registers and constants to the stack. Following along, we find that at %rsp+4 it places a 19-byte object with the following data (in GAS-like notation; the $nested.1594 and %rsp operands stand for the values stored at run time):

.word -17599
.int $nested.1594
.word -17847
.quad %rsp
.word -183
.byte -29

This is easy enough to run through a disassembler. Suppose that $nested.1594 is 0x01234567 and %rsp is 0x0123456789abcdef. The resulting disassembly, provided by objdump, is:

   0:   41 bb 67 45 23 01       mov    $0x1234567,%r11d
   6:   49 ba ef cd ab 89 67    mov    $0x123456789abcdef,%r10
   d:   45 23 01 
  10:   49 ff e3                rex.WB jmpq   *%r11

So the trampoline loads the address of the nested function into %r11 and the outer function's stack pointer into %r10, then jumps through %r11 to the nested function's body. The nested function body looks like this:

movl    (%r10), %eax
addl    %edi, %eax
ret

As you can see, the nested function uses %r10 to access the outer function's variables: outer stored x at (%rsp), so (%r10) reads x.
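
The same flow can be modelled in portable C, which may make the roles clearer. This is a rough sketch, not what GCC emits: struct outer_frame, nested_lifted, trampoline_model, apply_to_five and outer_model are all hypothetical names, and a file-scope variable stands in for the frame address that the real trampoline carries in its own instruction bytes.

```c
/* Hypothetical stand-in for outer's stack frame: only x lives in it. */
struct outer_frame {
    int x;
};

/* nested() after lifting: the static chain is an explicit first parameter
   here, whereas the generated code receives it in %r10. */
static int nested_lifted(const struct outer_frame *chain, int y)
{
    return chain->x + y;
}

/* Model of the trampoline: it has the plain int(int) signature the caller
   expects, supplies the chain, and forwards to the lifted function.
   ASSUMPTION: a static variable replaces the immediate operand baked into
   the real trampoline, so this model is neither reentrant nor thread-safe. */
static const struct outer_frame *baked_chain;

static int trampoline_model(int y)
{
    return nested_lifted(baked_chain, y);
}

/* Hypothetical callee standing in for func(): it only knows int(*)(int). */
static int apply_to_five(int (*param)(int))
{
    return param(5);
}

int outer_model(int x)
{
    struct outer_frame frame = { x };

    baked_chain = &frame;                   /* "write the trampoline" */
    return apply_to_five(trampoline_model); /* pass it like nested */
}
```

Under these assumptions, outer_model(10) evaluates nested_lifted with x = 10 and y = 5. The real trampoline avoids the shared variable because each call to outer writes a fresh copy of the code, with its own frame address, into its own stack frame.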

Of course, it's fairly silly that the trampoline is larger than the nested function itself. You could easily do better. But not very many people use this feature, and this way, the trampoline can stay the same size (19 bytes) no matter how large the nested function is.
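
To double-check the 19-byte layout dissected above, the bytes can be assembled by hand in C. This sketch only fills a buffer and never executes it; encode_trampoline and put_le are hypothetical helper names, and the opcode bytes are taken from the objdump listing above rather than from GCC's sources.

```c
#include <stddef.h>
#include <stdint.h>

enum { TRAMPOLINE_SIZE = 19 };

/* Store the low n bytes of v at p, least significant byte first. */
static void put_le(uint8_t *p, uint64_t v, size_t n)
{
    for (size_t i = 0; i < n; i++)
        p[i] = (uint8_t)(v >> (8 * i));
}

/* Fill out[] with the trampoline bytes for a nested function at fn_addr
   (assumed to fit in 32 bits, as in the listing) and a static chain value. */
void encode_trampoline(uint8_t out[TRAMPOLINE_SIZE],
                       uint32_t fn_addr, uint64_t chain)
{
    out[0] = 0x41;              /* 41 bb imm32: mov $fn_addr, %r11d */
    out[1] = 0xbb;
    put_le(out + 2, fn_addr, 4);

    out[6] = 0x49;              /* 49 ba imm64: mov $chain, %r10 */
    out[7] = 0xba;
    put_le(out + 8, chain, 8);

    out[16] = 0x49;             /* 49 ff e3: jmp *%r11 */
    out[17] = 0xff;
    out[18] = 0xe3;
}
```

Calling encode_trampoline(buf, 0x01234567, 0x0123456789abcdef) should reproduce the byte sequence in the disassembly. The constants -17599, -17847, -183 and -29 in the assembly of outer are the same opcode bytes written as signed 16-bit and 8-bit little-endian values (0xbb41, 0xba49, 0xff49 and 0xe3).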

Final note: At the bottom of the assembly, there is a final directive:

.section        .note.GNU-stack,"x",@progbits

This instructs the linker to mark the stack as executable.

Vingtetun answered 18/11, 2011 at 8:24 Comment(17)
Wouldn't it be possible to put the trampoline on the heap using a malloc and a final free on return?Overgrow
It shouldn't be too hard for function prologues to push an access link on the stack as a mechanism for avoiding trampolines, would it?Phonolite
@Phonolite Using access links does not really solve the problem, if I understood correctly. If you call foo from the parameter function f, which counter gets incremented? Obviously, there is no way to track the stack frame that corresponds to a function pointer without additional information...Overgrow
@FUZxxl: That would leak with longjmp -- and how would you handle NULL return from malloc?Vingtetun
Hrm, then Pascal must have had some additional restrictions on nested functions that allowed access links to be a suitable solution. Never thought about it this much. :)Phonolite
@sarnold: Pascal can do it in a simpler way: simply by using a different calling convention.Vingtetun
@DietrichEpp, AFAIK, Pascal does not have a procedure or a pointer-to-procedure type and cannot pass procedures as arguments.Shaughnessy
@chill: Some dialects of Pascal have function pointers. Nested functions are not standard C anyway.Vingtetun
And now, many years later... "The use of trampolines requires an executable stack, which is a security risk. To avoid this problem, GCC also supports another strategy: using descriptors for nested functions..." It is good that the link to the GCC documentation page still points to the same feature.Charlatanism
Can you explain the same thing from a symbolic computation perspective? That is, explain how that assembler was generated using some high-level concepts (in a concrete way, ideally in a Lisp-like style).Gumshoe
@alinsoar: From a high-level perspective, if you have an int nested(int y); then GCC is actually emitting something like int nested(void *link, int y); and generating a trampoline int trampoline(int y) { return nested(0x01234567, y); }. That’s really all it is. The trampoline is kept short because it has an optimized sibling call and uses a custom calling convention for the link parameter. However, the assembler for the trampoline itself is almost certainly generated by hand.Vingtetun
@DietrichEpp this algorithm is actually the same as the one from cgi.sice.indiana.edu/~c311/lib/exe/…Gumshoe
@alinsoar: That's actually a different technique. It's also called a trampoline, but it's a different algorithm.Vingtetun
Can you provide a link to the algorithm used?Gumshoe
@alinsoar: Everything should be described in the answer. What else would you like to know?Vingtetun
@DietrichEpp I would like to read an article that details the internals of how GCC implements nested functions and covers what you explained here.Gumshoe
@alinsoar: There is not much to explain here--the internals are very GCC-specific, and the algorithm is too simple to warrant additional details (all the algorithm really does is emit code which writes a tiny function to the stack). If you are not familiar with GCC internals you may find this unsatisfying to read. Internals docs: gcc.gnu.org/onlinedocs/gccint/Trampolines.html - code for i386 - github.com/gcc-mirror/gcc/blob/master/gcc/config/i386/i386.c (see ix86_trampoline_init)Vingtetun
