Generate tail call opcode
Asked Answered
L

3

43

Out of curiosity I was trying to generate a tail call opcode using C#. Fibinacci is an easy one, so my c# example looks like this:

    private static void Main(string[] args)
    {
        Console.WriteLine(Fib(int.MaxValue, 0));
    }

    public static int Fib(int i, int acc)
    {
        if (i == 0)
        {
            return acc;
        }

        return Fib(i - 1, acc + i);
    }

If I build it in release and run it without debugging I do not get a stack overflow. Debugging or running it without optimizations and I do get a stack overflow, implying that tail call is working when in release with optimizations on (which is what I expected).

The MSIL for this looks like this:

.method public hidebysig static int32 Fib(int32 i, int32 acc) cil managed
{
    // Method Start RVA 0x205e
    // Code Size 17 (0x11)
    .maxstack 8
    L_0000: ldarg.0 
    L_0001: brtrue.s L_0005
    L_0003: ldarg.1 
    L_0004: ret 
    L_0005: ldarg.0 
    L_0006: ldc.i4.1 
    L_0007: sub 
    L_0008: ldarg.1 
    L_0009: ldarg.0 
    L_000a: add 
    L_000b: call int32 [ConsoleApplication2]ConsoleApplication2.Program::Fib(int32,int32)
    L_0010: ret 
}

I would've expected to see a tail opcode, per the msdn, but it's not there. This got me wondering if the JIT compiler was responsible for putting it in there? I tried to ngen the assembly (using ngen install <exe>, navigating to the windows assemblies list to get it) and load it back up in ILSpy but it looks the same to me:

.method public hidebysig static int32 Fib(int32 i, int32 acc) cil managed
{
    // Method Start RVA 0x3bfe
    // Code Size 17 (0x11)
    .maxstack 8
    L_0000: ldarg.0 
    L_0001: brtrue.s L_0005
    L_0003: ldarg.1 
    L_0004: ret 
    L_0005: ldarg.0 
    L_0006: ldc.i4.1 
    L_0007: sub 
    L_0008: ldarg.1 
    L_0009: ldarg.0 
    L_000a: add 
    L_000b: call int32 [ConsoleApplication2]ConsoleApplication2.Program::Fib(int32,int32)
    L_0010: ret 
}

I still don't see it.

I know F# handles tail call well, so I wanted to compare what F# did with what C# did. My F# example looks like this:

let rec fibb i acc =  
    if i = 0 then
        acc
    else 
        fibb (i-1) (acc + i)


Console.WriteLine (fibb 3 0)

And the generated IL for the fib method looks like this:

.method public static int32 fibb(int32 i, int32 acc) cil managed
{
    // Method Start RVA 0x2068
    // Code Size 18 (0x12)
    .custom instance void [FSharp.Core]Microsoft.FSharp.Core.CompilationArgumentCountsAttribute::.ctor(int32[]) = { int32[](Mono.Cecil.CustomAttributeArgument[]) }
    .maxstack 5
    L_0000: nop 
    L_0001: ldarg.0 
    L_0002: brtrue.s L_0006
    L_0004: ldarg.1 
    L_0005: ret 
    L_0006: ldarg.0 
    L_0007: ldc.i4.1 
    L_0008: sub 
    L_0009: ldarg.1 
    L_000a: ldarg.0 
    L_000b: add 
    L_000c: starg.s acc
    L_000e: starg.s i
    L_0010: br.s L_0000
}

Which, according to ILSpy, is equivalent to this:

[Microsoft.FSharp.Core.CompilationArgumentCounts(Mono.Cecil.CustomAttributeArgument[])]
public static int32 fibb(int32 i, int32 acc)
{
    label1:
    if !(((i != 0))) 
    {
        return acc;
    }
    (i - 1);
    i = acc = (acc + i);;
    goto label1;
}

So F# generated tail call using goto statements? This isn't what I was expecting.

I'm not trying to rely on tail call anywhere, but I am just curious where exactly does that opcode get set? How is C# doing this?

Liqueur answered 7/4, 2013 at 16:22 Comment(13)
I don't believe C# ever does tail call optimizationHailee
It looks like the JIT is performing the tail call optimization opportunistically, which is a bit weird, considering that the actual call tree is used by Code Access Security, one wouldn't expect that the JIT is allowed to adjust it without a tailcail notation in the MSIL.Yoicks
F# (like IronScheme) uses tail call elimination to change the 'expensive' tail call to a 'cheap' local jump. This is done in the compiler.Cm
@Hailee it definitely does something, since Ivan do Fibonacci of int max without overflowLiqueur
@leppie, right, and the compiler emits IL. Which is what I'm trying to understandLiqueur
@devshorts: The optimization is called tail call elimination, hence you wont see it (the tail opcode or the call). You will need a more complex example to prevent such an optimization.Cm
@benvoigt, how come ngen didn't trigger that code? I thought ngen would force a jitLiqueur
@devshorts: The JIT doesn't change the MSIL. It generates machine code. See Hans's answer where he looks at the output from the JIT and finds the tail call has been converted to a jump.Yoicks
AFAIK tail. call generation depends also on whether you compile for x86 or x64. Last time I read about this, tail call optimization did not work when targeting x64.Selfish
@stakx I think it's the other way around: the x64 JIT will sometimes make the tail call optimization even without tail..Annulation
@svick: I stand corrected then. My memory might well be a little off on that topic, but I remember there definitely being a difference between the two targets.Selfish
See blogs.msdn.com/b/fsharpteam/archive/2011/07/08/… for more low-level details on how F# handles tail calls.Supersensual
@kvb, that article was perfect. Thank you for sharing!Liqueur
T
51

C# compiler does not give you any guarantees about tail-call optimizations because C# programs usually use loops and so they do not rely on the tail-call optimizations. So, in C#, this is simply a JIT optimization that may or may not happen (and you cannot rely on it).

F# compiler is designed to handle functional code that uses recursion and so it does give you certain guarantees about tail-calls. This is done in two ways:

  • if you write a recursive function that calls itself (like your fib) the compiler turns it into a function that uses loop in the body (this is a simple optimization and the produced code is faster than using a tail-call)

  • if you use a recursive call in a more complex position (when using continuation passing style where function is passed as an argument), then the compiler generates a tail-call instruction that tells the JIT that it must use a tail call.

As an example of the second case, compile the following simple F# function (F# does not do this in Debug mode to simplify debugging, so you may need Release mode or add --tailcalls+):

let foo a cont = cont (a + 1)

The function simply calls the function cont with the first argument incremented by one. In continuation passing style, you have a long sequence of such calls, so the optimization is crucial (you simply cannot use this style without some handling of tail calls). The generates IL code looks like this:

IL_0000: ldarg.1
IL_0001: ldarg.0
IL_0002: ldc.i4.1
IL_0003: add
IL_0004: tail.                          // Here is the 'tail' opcode!
IL_0006: callvirt instance !1 
  class [FSharp.Core] Microsoft.FSharp.Core.FSharpFunc`2<int32, !!a>::Invoke(!0)
IL_000b: ret
Tenenbaum answered 7/4, 2013 at 17:1 Comment(4)
I think the main reason the C# compiler doesn’t use tail-call is because it would make stack traces look “wrong”.Dealing
@Dealing That is actually a reason why tailcalls are disabled by default in Debug mode in F#...Tenenbaum
Sure, but stack traces aren’t just for Debug mode. Customers run your program (and generate exception stack traces) in Release mode.Dealing
@Dealing The stack trace will still be there and still be helpful, I imagine this is probably geared more towards stepping through (which you probably won't be doing in release mode) while debugging than a difference in reasoning power from the final trace.Huoh
A
27

The situation with tail call optimization in .Net is quite complicated. As far as I know, it's like this:

  • The C# compiler will never emit the tail. opcode and it will also never do the tail call optimization by itself.
  • The F# compiler sometimes emits the tail. opcode and sometimes does the tail call optimization by itself by emitting IL that's not recursive.
  • The CLR will honor the tail. opcode if it's present and the 64-bit CLR will sometimes make the tail call optimization even when the opcode is not present.

So, in your case, you didn't see the tail. opcode in the IL generated by the C# compiler, because it doesn't do that. But the method was tail-call optimized, because the CLR sometimes does that even without the opcode.

And in the F# case, you observed that the f# compiler did the optimization by itself.

Annulation answered 7/4, 2013 at 17:15 Comment(1)
Thanks for the clear and succinct answer, it was very helpful!Liqueur
K
10

Like all optimizations performed in .NET (Roslyn languages), tail call optimization is a job performed by the jitter, not the compiler. The philosophy is that putting the job on the jitter is useful since any language will benefit from it and the normally difficult job of writing and debugging a code optimizer has to be done only once per architecture.

You have to look at the generated machine code to see it being done, Debug + Windows + Disassembly. With the further requirement that you do so by looking at the Release build code that's generated with the Tools + Options, Debugging, General, Suppress JIT optimization unticked.

The x64 code looks like this:

        public static int Fib(int i, int acc) {
            if (i == 0) {
00000000  test        ecx,ecx 
00000002  jne         0000000000000008 
                return acc;
00000004  mov         eax,edx 
00000006  jmp         0000000000000011 
            }

            return Fib(i - 1, acc + i);
00000008  lea         eax,[rcx-1] 
0000000b  add         edx,ecx 
0000000d  mov         ecx,eax 
0000000f  jmp         0000000000000000              // <== here!!!
00000011  rep ret  

Note the marked instruction, a jump instead of a call. That's tail call optimization at work. A quirk in .NET is that the 32-bit x86 jitter does not perform this optimization. Simply a to-do item that they'll probably never get around to. Which did require the F# compiler writers to not ignore the problem and emit Opcodes.Tailcall. You'll find other optimizations performed by the jitter documented in this answer.

Kirwin answered 7/4, 2013 at 16:56 Comment(8)
I think the question is specifically about the tail. opcode, which you didn't even mention.Annulation
How is this a legal optimization? It changes the observable behavior of the program (with respect to stack walking, which is essential to Code Access Security and exception reporting). Well, in this case there is no call to any other function, so CAS can be shown not to be an issue.Yoicks
@Annulation - the question was why his C# code didn't crash. I think I adequately explained that by showing the generated code.Kirwin
@BenVoigt: Why would CAS be relevant between calls in the same assembly? As for exception reporting, see: funcall.blogspot.com/2011/03/tail-recursion-and-debugging.htmlCm
@leppie: I'm not saying that it isn't desirable to eliminate stack frames in most cases; I'm saying it changes the observable behavior (apart from runtime) and therefore is ineligible for optimization. For CAS, doesn't it distinguish between (my caller) and (caller of my caller)? Tail call elimination can change the latter.Yoicks
That's nonsense, method inlining is also a standard optimization.Kirwin
@Hans: True, but the JIT has to consider additional factors before it can remove call frames. Which makes a question like this somewhat more complex.Yoicks
I'm curious what you mean by "optimizations performed in .NET" in the sentence "Like all optimizations performed in .NET (...) is a job performed by the jitter"? By "in .NET" do you mean by jitter? In that case, the sentence is a tautology (and is just confusing). Or do you mean by .NET languages in general? In that case, it is simply wrong, because F# does other optimizations. (I know C# and VB do not, but that's another story...)Tenenbaum

© 2022 - 2024 — McMap. All rights reserved.