Compiled expression tree gives different result than the equivalent code

The following code:

using System;
using System.Linq.Expressions;

double c1 = 182273d;
double c2 = 0.888d;
Expression c1e = Expression.Constant(c1, typeof(double));
Expression c2e = Expression.Constant(c2, typeof(double));
Expression<Func<double, double>> sinee = a => Math.Sin(a);
// Rebuild the Math.Sin call with the constant argument (null object: static method)
Expression sine = ((MethodCallExpression)sinee.Body).Update(null, new[] { c1e });
Expression sum = Expression.Add(sine, c2e);
Func<double> f = Expression.Lambda<Func<double>>(sum).Compile();
double r = f();
double rr = Math.Sin(c1) + c2;
Console.WriteLine(r.ToString("R"));
Console.WriteLine(rr.ToString("R"));

Will output:

0.082907514933846488
0.082907514933846516

Why are r and rr different?

Update:

I found that this reproduces if I select the "x86" platform target, or check "Prefer 32-bit" with "Any CPU". In x64 mode it works correctly.
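
As a quick sanity check of the mode a process actually runs in (not part of the original repro; these are just standard runtime APIs):

Console.WriteLine(Environment.Is64BitProcess ? "x64 process" : "x86 process");
Console.WriteLine("IntPtr.Size = " + IntPtr.Size); // 4 in a 32-bit process, 8 in a 64-bit one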

Terrorstricken answered 12/3, 2017 at 12:0 Comment(3)
If I copy-paste your code, I cannot reproduce your output (I'm getting two identical values).Absa
Yeah, it generates the same two values for me too. Here is a working sample: dotnetfiddle.net/Fs8UQf Cupellation
Thanks for the quick feedback. Please see my update.Terrorstricken

I'm not an expert on such things, but I'll give my view on this.

First, the problem appears only when compiling with the debug flag (it does not appear in release mode), and indeed only when running as x86.

If we decompile the method that your expression compiles to, we will see this (identical in both debug and release):

IL_0000: ldc.r8       182273 // push first value
IL_0009: call         float64 [mscorlib]System.Math::Sin(float64) // call Math.Sin()
IL_000e: ldc.r8       0.888 // push second value
IL_0017: add          // add
IL_0018: ret 

However, if we look at the IL of a similar method compiled by the C# compiler in debug mode, we will see:

.locals init (
  [0] float64 V_0
)
IL_0001: ldc.r8       182273
IL_000a: call         float64 [mscorlib]System.Math::Sin(float64)
IL_000f: ldc.r8       0.888
IL_0018: add          
IL_0019: stloc.0      // save to local
IL_001a: br.s         IL_001c // basically nop
IL_001c: ldloc.0      // V_0 // pop from local to stack
IL_001d: ret          // return

You can see that the compiler added an (unnecessary) save and load of the result to a local variable (probably for debugging purposes). Now, here I'm not sure, but as far as I've read, on the x86 architecture double values might be stored in 80-bit CPU registers (quote from here):

By default, in code for x86 architectures the compiler uses the coprocessor's 80-bit registers to hold the intermediate results of floating-point calculations. This increases program speed and decreases program size. However, because the calculation involves floating-point data types that are represented in memory by less than 80 bits, carrying the extra bits of precision—80 bits minus the number of bits in a smaller floating-point type—through a lengthy calculation can produce inconsistent results.

So my guess would be that this store to a local and load back from it forces a round-trip between the 80-bit register representation and the 64-bit in-memory representation, which causes the behavior you observe.
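
If that guess is right, the effect should be reproducible in plain C#. Here is a minimal sketch (my own illustration, not code from the question), relying on the rule that an explicit cast to double forces the value to be rounded to true 64-bit precision, which imitates the store/load through the local:

double viaRegister = Math.Sin(182273d) + 0.888d;          // sin result may stay in an 80-bit x87 register
double viaRounding = (double)Math.Sin(182273d) + 0.888d;  // cast rounds the sin result to 64 bits before the add
Console.WriteLine(viaRegister.ToString("R"));
Console.WriteLine(viaRounding.ToString("R"));

On x86 with the x87 FPU the two lines can print different values; on x64, where the arithmetic is done in 64-bit SSE registers throughout, they should match.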

Another explanation might be that the JIT behaves differently between debug and release modes (it might still be related to storing intermediate results in 80-bit registers).

Hopefully some people who know more can confirm if I'm right or not on this.

Update in response to a comment. One way to decompile an expression is to create a dynamic assembly, compile the expression into a method there, save it to disk, and then inspect it with any decompiler (I use JetBrains dotPeek). Example:

using System;
using System.Linq.Expressions;
using System.Reflection;
using System.Reflection.Emit;

// Build a saveable dynamic assembly, emit the expression into a
// static method, and write the assembly to disk for inspection.
var asm = AppDomain.CurrentDomain.DefineDynamicAssembly(
    new AssemblyName("dynamic_asm"),
    AssemblyBuilderAccess.Save);

var module = asm.DefineDynamicModule("dynamic_mod", "dynamic_asm.dll");
var type = module.DefineType("DynamicType");
var method = type.DefineMethod(
    "DynamicMethod", MethodAttributes.Public | MethodAttributes.Static);
Expression.Lambda<Func<double>>(sum).CompileToMethod(method);
type.CreateType();
asm.Save("dynamic_asm.dll");
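
Note that CompileToMethod and AssemblyBuilderAccess.Save exist only on .NET Framework, not on .NET Core, and sum is the expression built in the question. The saved dynamic_asm.dll can then be opened in dotPeek, ILSpy or ildasm.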
Absa answered 12/3, 2017 at 16:49 Comment(6)
Great! Thanks for the investigation.Terrorstricken
Can I ask how you decompile a compiled expression?Terrorstricken
I don't think the difference is caused by the store+load in IL. If I decompile Console.WriteLine((Math.Sin(182273d) + 0.888d).ToString("R")); in LinqPad in Debug mode, I do get the different result (ending 516). I think the difference is in how the JIT behaves in Debug vs. Release mode on x86.Luffa
@Luffa that may well be; I also have doubts about that store-load part.Absa
@Luffa on the other hand, if you compile Console.WriteLine((Math.Sin(182273d) + 0.888d).ToString("R")); in debug mode (just a Main method with this one line) and view the generated IL, you will see the exact same save to a local followed by a load, so it does not disprove my guess.Absa
@Absa Yeah, but in that case, it's the same in Release mode.Luffa

As has already been said, this is caused by a difference between Debug and Release modes on x86. It surfaced in your code in Debug mode because a compiled lambda expression is always JIT-compiled as Release code, while the inline Math.Sin(c1) + c2 was compiled in Debug mode.

The difference is not caused by the C# compiler. Consider the following version of your code:

using System;
using System.Runtime.CompilerServices;

static class Program
{
    static void Main() => Console.WriteLine(Compute().ToString("R"));

    [MethodImpl(MethodImplOptions.NoInlining)]
    static double Compute() => Math.Sin(182273d) + 0.888d;
}

The output is 0.082907514933846516 in Debug mode and 0.082907514933846488 in Release mode, but the IL is the same for both:

.class private abstract sealed auto ansi beforefieldinit Program
    extends [mscorlib]System.Object
{
  .method private hidebysig static void Main() cil managed 
  {
    .entrypoint
    .maxstack 2
    .locals init ([0] float64 V_0)

    IL_0000: call         float64 Program::Compute()
    IL_0005: stloc.0      // V_0
    IL_0006: ldloca.s     V_0
    IL_0008: ldstr        "R"
    IL_000d: call         instance string [mscorlib]System.Double::ToString(string)
    IL_0012: call         void [mscorlib]System.Console::WriteLine(string)
    IL_0017: ret          
  }

  .method private hidebysig static float64 Compute() cil managed noinlining 
  {
    .maxstack 8

    IL_0000: ldc.r8       182273
    IL_0009: call         float64 [mscorlib]System.Math::Sin(float64)
    IL_000e: ldc.r8       0.888
    IL_0017: add          
    IL_0018: ret          
  }
}
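
If you want to verify the "IL is the same" claim without eyeballing decompiler output, here is a small sketch (an addition of mine, using only standard reflection APIs) that dumps the raw IL bytes of Compute so you can diff the Debug and Release builds; drop it into Main:

// Dump the raw IL bytes of Program.Compute; run under Debug and Release and compare.
var compute = typeof(Program).GetMethod(
    "Compute", BindingFlags.NonPublic | BindingFlags.Static);
byte[] il = compute.GetMethodBody().GetILAsByteArray();
Console.WriteLine(BitConverter.ToString(il));

(This needs using System.Reflection;.)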

The difference lies in the generated machine code. The disassembly of Compute in Debug mode is:

012E04B2  in          al,dx  
012E04B3  push        edi  
012E04B4  push        esi  
012E04B5  push        ebx  
012E04B6  sub         esp,34h  
012E04B9  xor         ebx,ebx  
012E04BB  mov         dword ptr [ebp-10h],ebx  
012E04BE  mov         dword ptr [ebp-1Ch],ebx  
012E04C1  cmp         dword ptr ds:[1284288h],0  
012E04C8  je          012E04CF  
012E04CA  call        71A96150  
012E04CF  fld         qword ptr ds:[12E04F8h]  
012E04D5  sub         esp,8  
012E04D8  fstp        qword ptr [esp]  
012E04DB  call        71C87C80  
012E04E0  fstp        qword ptr [ebp-40h]  
012E04E3  fld         qword ptr [ebp-40h]  
012E04E6  fadd        qword ptr ds:[12E0500h]  
012E04EC  lea         esp,[ebp-0Ch]  
012E04EF  pop         ebx  
012E04F0  pop         esi  
012E04F1  pop         edi  
012E04F2  pop         ebp  
012E04F3  ret  

For Release mode:

00C204A0  push        ebp  
00C204A1  mov         ebp,esp  
00C204A3  fld         dword ptr ds:[0C204B8h]  
00C204A9  fsin  
00C204AB  fadd        qword ptr ds:[0C204C0h]  
00C204B1  pop         ebp  
00C204B2  ret  

Apart from using a function call to compute sin instead of using fsin directly (which doesn't seem to make a difference), the main change is that Release mode keeps the result of sin in a floating-point register, while Debug mode writes it to memory and reads it back (the fstp qword ptr [ebp-40h] and fld qword ptr [ebp-40h] instructions). This rounds the result of sin from 80-bit precision to 64-bit precision, resulting in different values.
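
Incidentally (an illustrative aside, not part of the original analysis), the two printed values really are distinct 64-bit doubles only a couple of ULPs apart, which you can see by comparing their raw bit patterns:

double debugResult = 0.082907514933846516;   // rounded through memory
double releaseResult = 0.082907514933846488; // kept in the 80-bit register
Console.WriteLine(BitConverter.DoubleToInt64Bits(debugResult).ToString("X16"));
Console.WriteLine(BitConverter.DoubleToInt64Bits(releaseResult).ToString("X16"));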

Curiously, the result of the same code on .NET Core (x64) is yet another value: 0.082907514933846627. The disassembly for that case shows that it uses SSE instructions rather than x87 (.NET Framework x64 does the same, so the difference is going to be in the called sin function):

00007FFD5C180B80  sub         rsp,28h  
00007FFD5C180B84  movsd       xmm0,mmword ptr [7FFD5C180BA0h]  
00007FFD5C180B8C  call        00007FFDBBEC1C30  
00007FFD5C180B91  addsd       xmm0,mmword ptr [7FFD5C180BA8h]  
00007FFD5C180B99  add         rsp,28h  
00007FFD5C180B9D  ret  
Luffa answered 13/3, 2017 at 15:15 Comment(1)
Thanks for the investigation, really interesting to read. Glad that my guess about 80-bit register rounding was basically correct (though the assumption that it's caused by different IL instructions in debug and release was incorrect).Absa
