Variables ending with "1" have the "1" removed within ILSpy. Why?

In an effort to explore how the C# compiler optimizes code, I've created a simple test application. With each test change, I've compiled the application and then opened the binary in ILSpy.

I just noticed something that, to me, is weird. Obviously this is intentional; however, I can't think of a good reason why the compiler would do this.

Consider the following code:

static void Main(string[] args)
{
    int test_1 = 1;
    int test_2 = 0;
    int test_3 = 0;

    if (test_1 == 1) Console.Write(1);
    else if (test_2 == 1) Console.Write(1);
    else if (test_3 == 1) Console.Write(2);
    else Console.Write("x");
}

Pointless code, but I had written this to see how ILSpy would interpret the if statements.

However, when I compiled and decompiled this code, I noticed something that had me scratching my head. My first variable, test_1, came back as test_ (the trailing "1" is gone). Is there a good reason why the C# compiler would do this?

For full inspection this is the output of Main() that I'm seeing in ILSpy.

private static void Main(string[] args)
{
    int test_ = 1; //Where did the "1" go at the end of the variable name???
    int test_2 = 0;
    int test_3 = 0;
    if (test_ == 1)
    {
        Console.Write(1);
    }
    else
    {
        if (test_2 == 1)
        {
            Console.Write(1);
        }
        else
        {
            if (test_3 == 1)
            {
                Console.Write(2);
            }
            else
            {
                Console.Write("x");
            }
        }
    }
}

UPDATE

After inspecting the IL, it appears that this is an issue with ILSpy, not the C# compiler. Eugene Podskal has given a good answer to my initial comments and observations. However, I am still interested in knowing whether this is a bug in ILSpy or intentional behavior.

Penna answered 5/9, 2014 at 18:27 Comment(8)
Yes, they are. This is why you can compile and decompile assemblies and see the full code for whatever you decompile. This is done by IL interpreters like ILSpy, which can output compiled .NET assemblies in any .NET language of your choosing. - Penna
That must never happen, at least not by itself or by the compiler. - Hays
Let me check the IL in ildasm... - Penna
@AfzaalAhmadZeeshan I was able to reproduce this using the posted code. - Benco
@user2864740, use ILDASM and look at the .locals init declaration. Each variable is, in fact, declared and added to an array of locals. Later references use the variable's position in that array, not its name. - Penna
It is an issue with ILSpy, or more generally with decompilers. dotPeek shows num1, num2, num3 for the variable names. - Baer
@Penna You are completely right not to mark my answer, because it ... doesn't answer the question: WHY are the variable names changed? - Titos
OK, I slightly modified my update, since you're correct. Thanks for understanding. - Penna

Well, it is a bug. Not much of a bug, and it's fairly unlikely that anybody ever filed a bug report for it. Do note that Eugene's answer is very misleading: ildasm.exe is smart enough to know how to locate the PDB file for an assembly and retrieve its debugging info, which includes the names of local variables.

This is not normally a luxury available to a disassembler. Those names are not actually present in the assembly itself, and a disassembler invariably has to make do without the PDB. You can see this in ildasm.exe as well: just delete the .pdb files in the obj\Release and bin\Release directories and it now looks like this:

.method private hidebysig static void  Main(string[] args) cil managed
{
  .entrypoint
  // Code size       50 (0x32)
  .maxstack  2
  .locals init (int32 V_0,
           int32 V_1,
           int32 V_2)
  IL_0000:  ldc.i4.1
  // etc...

Names like V_0, V_1, etc. are of course not great; a disassembler usually comes up with something better, something like "num".

So it's fairly clear where the bug in ILSpy is located: it too reads the PDB file but fumbles the symbol it retrieves. You could file the bug with the vendor, but it's pretty unlikely they'll treat it as a high-priority bug.

Terret answered 6/9, 2014 at 15:11 Comment(1)
Thanks Hans. Your answers are always quite informative!Penna

It is probably a problem with the decompiler, because the IL is correct on .NET 4.5 with VS2013:

.entrypoint
  // Code size       79 (0x4f)
  .maxstack  2
  .locals init ([0] int32 test_1,
           [1] int32 test_2,
           [2] int32 test_3,
           [3] bool CS$4$0000)
  IL_0000:  nop
  IL_0001:  ldc.i4.1
  IL_0002:  stloc.0

Edit: ildasm uses data from the .pdb file (see this answer) to get the correct variable names. Without the .pdb, the variables appear in the form V_0, V_1, V_2.

EDIT:

The variable name is mangled in the file NameVariables.cs, in this method:

public string GetAlternativeName(string oldVariableName)
{
    if (oldVariableName.Length == 1 && oldVariableName[0] >= 'i' && oldVariableName[0] <= maxLoopVariableName) {
        for (char c = 'i'; c <= maxLoopVariableName; c++) {
            if (!typeNames.ContainsKey(c.ToString())) {
                typeNames.Add(c.ToString(), 1);
                return c.ToString();
            }
        }
    }

    int number;
    string nameWithoutDigits = SplitName(oldVariableName, out number);

    if (!typeNames.ContainsKey(nameWithoutDigits)) {
        typeNames.Add(nameWithoutDigits, number - 1);
    }

    int count = ++typeNames[nameWithoutDigits];

    if (count != 1) {
        return nameWithoutDigits + count.ToString();
    } else {
        return nameWithoutDigits;
    }
}

The NameVariables class uses the this.typeNames dictionary to store variable names with their trailing number stripped (such names may mean something special to ILSpy, or perhaps even to IL, but I actually doubt it), each associated with a counter of its appearances in the method being decompiled.

This means that all three variables (test_1, test_2, test_3) end up in one slot ("test_"). For the first one, the slot is created with number - 1 = 0 and then incremented, so count is 1 and this branch executes:

else {
    return nameWithoutDigits;
}

where nameWithoutDigits is "test_". For test_2 and test_3 the counter reaches 2 and 3, so those names come out unchanged.
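
To see the effect in isolation, here is a minimal sketch of the second half of GetAlternativeName. SplitName is not shown in the excerpt above, so the version here is an assumption: it strips a trailing run of digits and reports 1 when there are none. The class name VariableNamer is made up for the example.

```csharp
using System;
using System.Collections.Generic;

public class VariableNamer
{
    // Maps a digit-stripped base name to the last counter value handed out for it.
    private readonly Dictionary<string, int> typeNames = new Dictionary<string, int>();

    // ASSUMPTION: guessed behavior of SplitName, which the quoted excerpt does not show.
    // "test_1" -> ("test_", 1); a name without trailing digits -> (name, 1).
    private static string SplitName(string name, out int number)
    {
        int pos = name.Length;
        while (pos > 0 && name[pos - 1] >= '0' && name[pos - 1] <= '9')
            pos--;
        if (pos < name.Length && int.TryParse(name.Substring(pos), out number))
            return name.Substring(0, pos);
        number = 1;
        return name;
    }

    // Same logic as the quoted method, without the single-letter loop-variable case.
    public string GetAlternativeName(string oldVariableName)
    {
        int number;
        string nameWithoutDigits = SplitName(oldVariableName, out number);

        if (!typeNames.ContainsKey(nameWithoutDigits))
            typeNames.Add(nameWithoutDigits, number - 1);

        int count = ++typeNames[nameWithoutDigits];

        // A counter of exactly 1 drops the suffix, which is what turns test_1 into test_.
        return count != 1 ? nameWithoutDigits + count.ToString() : nameWithoutDigits;
    }

    public static void Main()
    {
        var namer = new VariableNamer();
        foreach (string name in new[] { "test_1", "test_2", "test_3" })
            Console.WriteLine(name + " -> " + namer.GetAlternativeName(name));
    }
}
```

Under that assumption about SplitName, the program prints test_1 -> test_, test_2 -> test_2 and test_3 -> test_3, which matches the ILSpy output in the question.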

EDIT

First, thanks to @HansPassant and his answer for pointing out the fault in this post.

So, the source of the problem:

ILSpy is as smart as ildasm, because it also uses .pdb data (how else would it get the names test_1 and test_2 at all?). But its inner workings are optimized for assemblies without any debug-related info, so the optimizations it applies to variables like V_0, V_1, V_2 work inconsistently with the wealth of metadata from the .pdb file.

As I understand it, the culprit is an optimization that removes the numeric suffix from a lone variable.

Fixing it will probably require propagating the fact that .pdb data is in use into the variable name generation code.
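
One way such a fix could look, purely as a sketch: the nameIsFromDebugSymbols parameter, the usedNames set and the PdbAwareNamer class are all hypothetical and not part of ILSpy, and SplitName is again an assumed helper that strips trailing digits. The idea is to keep a name that came from the .pdb verbatim unless it is already taken, and to fall back to the counter scheme otherwise.

```csharp
using System;
using System.Collections.Generic;

public class PdbAwareNamer
{
    private readonly Dictionary<string, int> typeNames = new Dictionary<string, int>();
    // Hypothetical addition: every name already handed out in the current method.
    private readonly HashSet<string> usedNames = new HashSet<string>();

    // ASSUMPTION: guessed behavior of ILSpy's SplitName (strip trailing digits).
    private static string SplitName(string name, out int number)
    {
        int pos = name.Length;
        while (pos > 0 && name[pos - 1] >= '0' && name[pos - 1] <= '9')
            pos--;
        if (pos < name.Length && int.TryParse(name.Substring(pos), out number))
            return name.Substring(0, pos);
        number = 1;
        return name;
    }

    // nameIsFromDebugSymbols is a hypothetical flag telling the namer that the
    // incoming name was read from the .pdb rather than generated (V_0, V_1, ...).
    public string GetAlternativeName(string oldVariableName, bool nameIsFromDebugSymbols)
    {
        // A name the programmer wrote is kept as-is unless it is already in use.
        if (nameIsFromDebugSymbols && usedNames.Add(oldVariableName))
            return oldVariableName;

        int number;
        string nameWithoutDigits = SplitName(oldVariableName, out number);
        if (!typeNames.ContainsKey(nameWithoutDigits))
            typeNames.Add(nameWithoutDigits, number - 1);

        // Original counter scheme, repeated until the result does not collide.
        string candidate;
        do
        {
            int count = ++typeNames[nameWithoutDigits];
            candidate = count != 1 ? nameWithoutDigits + count.ToString() : nameWithoutDigits;
        } while (!usedNames.Add(candidate));
        return candidate;
    }

    public static void Main()
    {
        var namer = new PdbAwareNamer();
        foreach (string name in new[] { "test_1", "test_2", "test_3" })
            Console.WriteLine(name + " -> " + namer.GetAlternativeName(name, true));
    }
}
```

With the flag set, test_1, test_2 and test_3 pass through unchanged. Whether ILSpy's real code could carry such a flag this far down is not something the excerpt shows.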

Titos answered 5/9, 2014 at 18:34 Comment(4)
Thank you. This is the answer. I should have taken this a step further; I just found this myself when I checked the IL. :/ Guess this is a bug within ILSpy. - Penna
Seeing this, we're probably looking at a case where the decompiler removes the '1' because it sees certain underscored names as "magical", such as the otherwise anonymous fields generated by 'property {get; set;}' or lambdas. - Wolfsbane
I would delete this question, but since many people use ILSpy, I'll leave it for future reference. You'll get the answer mark in about 5 minutes. - Penna
@EugenePodskal I've updated my question to reflect the curiosity within ILSpy. Also, someone has, in fact, downvoted me? Sometimes I don't understand. - Penna