Why does the C# compiler remove a chain of method calls when the last one is conditional?

71

Consider the following classes:

using System;

public class A {
    public B GetB() {
        Console.WriteLine("GetB");
        return new B();
    }
}

public class B {
    [System.Diagnostics.Conditional("DEBUG")]
    public void Hello() {
        Console.WriteLine("Hello");
    }
}

Now, if we were to call the methods this way:

var a = new A();
var b = a.GetB();
b.Hello();

In a release build (i.e. no DEBUG flag), we would only see GetB printed on the console, as the call to Hello() would be omitted by the compiler. In a debug build, both prints would appear.

Now let's chain the method calls:

a.GetB().Hello();

The behavior in a debug build is unchanged; however, we get a different result if the flag isn't set: both calls are omitted and no prints appear on the console. A quick look at IL shows that the whole line wasn't compiled.
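For anyone who wants to reproduce this without switching build configurations, here is a minimal sketch. It assumes a made-up symbol, DEMO_TRACE, in place of DEBUG, plus a CallLog helper and a Repro class that are not part of the original code. Because DEMO_TRACE is never defined in the file, the program should behave like the release build described above.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public static class CallLog {
    // Records which methods actually ran, so the effect is observable.
    public static readonly List<string> Entries = new List<string>();
}

public class A {
    public B GetB() {
        CallLog.Entries.Add("GetB");
        return new B();
    }
}

public class B {
    // DEMO_TRACE is a hypothetical symbol that is never defined here,
    // so calls to Hello() are omitted regardless of Debug/Release.
    [Conditional("DEMO_TRACE")]
    public void Hello() {
        CallLog.Entries.Add("Hello");
    }
}

public static class Repro {
    public static List<string> WithLocal() {
        CallLog.Entries.Clear();
        var a = new A();
        var b = a.GetB();   // a statement of its own: always executed
        b.Hello();          // omitted
        return new List<string>(CallLog.Entries);
    }

    public static List<string> Chained() {
        CallLog.Entries.Clear();
        var a = new A();
        a.GetB().Hello();   // whole statement omitted, including a.GetB()
        return new List<string>(CallLog.Entries);
    }

    public static void Main() {
        Console.WriteLine("local:   [" + string.Join(", ", WithLocal()) + "]");
        Console.WriteLine("chained: [" + string.Join(", ", Chained()) + "]");
    }
}
```

Adding #define DEMO_TRACE as the very first line of the file (or passing the symbol to the compiler) should bring both calls back in both forms.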

According to the latest ECMA standard for C# (ECMA-334, i.e. C# 5.0), the expected behavior when the Conditional attribute is placed on the method is the following (emphasis mine):

A call to a conditional method is included if one or more of its associated conditional compilation symbols is defined at the point of call, otherwise the call is omitted. (§22.5.3)

This doesn't seem to indicate that the entire chain should be ignored, hence my question. That being said, the C# 6.0 draft spec from Microsoft offers a bit more detail:

If the symbol is defined, the call is included; otherwise, the call (including evaluation of the receiver and parameters of the call) is omitted.

The fact that parameters of the call aren't evaluated is well-documented since it's one of the reasons people use this feature rather than #if directives in the function body. The part about "evaluation of the receiver", however, is new - I can't seem to find it elsewhere, and it does seem to explain the above behavior.

In light of this, my question is: what's the rationale behind the C# compiler not evaluating a.GetB() in this situation? Should it really behave differently based on whether the receiver of the conditional call is stored in a temporary variable or not?

Frausto answered 13/3, 2018 at 10:7 Comment(8)
This is just a guess, but I think since you don't keep a reference to B in the method-chaining version, the compiler omits its creation because it mistakenly "thinks" you only want to call the Hello() method. Great question! - Keaton
@ZoharPeled Thanks! I noticed the added detail in the C# 6.0 draft later, but the behavior predates this draft. It does seem to indicate that this is the intended behavior, so I'm mostly interested in the rationale behind this now: why is this intended, and why wasn't it documented until fairly recently. - Frausto
It makes more sense if you consider this as an additional hidden method parameter (which it is). It simply gets the same treatment as all the other parameters. - Pistoleer
@LucasTrzesniewski That's a reasonable explanation - and it'd probably make a better answer than the one that is being upvoted right now. - Frausto
I think @Zohar is on the right path. What's the behavior if you compile on release but disable "optimize code"? - Swingle
The omission of the bit about not evaluating the receiver was plainly an error in earlier versions of the ECMA spec. - Vassallo
I don't have Visual Studio open at the moment to see - but it sure would be sweet if there was some kind of visual indication on the line a.GetB().Hello(); to indicate that it was subject to this vanishing act. Because otherwise those semantics are rather invisible to the reader. - Musca
@Musca There is not, sadly! You get warnings only if you use #if. - Frausto
13

I did some digging and found the C# 5.0 language specification did actually already contain your second quote in section 17.4.2 The Conditional attribute on page 424.

Marc Gravell’s answer already shows that this behaviour is intended and what it means in practice. You also asked about the rationale behind this but seem to be dissatisfied by Marc's mention of removing overhead.

Maybe you wonder why it is considered overhead that can be removed?

a.GetB().Hello(); not being called at all in your scenario with Hello() being omitted might seem odd at face value.

I do not know the rationale behind the decision, but I came up with some plausible reasoning on my own. Maybe it can help you as well.

Method chaining is only possible if each previous method has a return value. This makes sense when you want to do something with these values, e.g. a.GetFoos().MakeBars().AnnounceBars();

If you have a function that only does something without returning a value you cannot chain something behind it but can put it at the end of the method chain, as is the case with your conditional method since it has to have the return type void.

Also note that the result of the previous method calls gets thrown away, so in your example of a.GetB().Hello(); the result of GetB() has no reason to live after this statement is executed. Basically, you imply you need the result of GetB() only to use Hello().

If Hello() is omitted, why do you need to call GetB() at all? If you omit Hello(), your line boils down to a.GetB(); without any assignment, and many tools will warn that you are not using the return value because this is seldom something you want to do.

The reason why you seem to not be okay with this is your method is not only trying to do what is necessary to return a certain value, but you also have a side effect, namely I/O. If you did instead have a pure function there would really be no reason to GetB() if you omit the subsequent call, i.e. if you are not going to do anything with the result.

If you assign the result of GetB() to a variable, this is a statement on its own and will be executed anyway. So this reasoning explains why in

var b = a.GetB();
b.Hello();

only the call to Hello() is omitted while when using method chaining the whole chain is omitted.

You can also look somewhere entirely different to get a better perspective: the null-conditional operator, or elvis operator, ?. introduced in C# 6.0. Although it is only syntactic sugar for a more complex expression with null checks, it allows you to build something like a method chain with the option to short-circuit based on the null check.

E.g. GetFoos()?.MakeBars()?.AnnounceBars(); will only reach its end if the previous methods do not return null; otherwise subsequent calls are omitted.

It might be counter-intuitive but try thinking of your scenario as the inverse of this: the compiler omits your calls prior to Hello() in your a.GetB().Hello(); chain since you are not reaching the end of the chain anyway.
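To make the analogy concrete, here is a small sketch of the null-conditional short-circuit. The classes Foos and Bars, the returnNull switch and the Log list are invented for illustration. Note the difference it shows: the short-circuit is decided at run time and only skips calls after the null, so GetFoos() itself always runs, whereas [Conditional] removes the call and everything leading up to it at compile time.

```csharp
using System;
using System.Collections.Generic;

public static class NullConditionalDemo {
    // Records which methods actually ran.
    static readonly List<string> Log = new List<string>();

    class Foos {
        public Bars MakeBars() { Log.Add("MakeBars"); return new Bars(); }
    }

    class Bars {
        public void AnnounceBars() { Log.Add("AnnounceBars"); }
    }

    // Hypothetical source of the chain; returns null on request.
    static Foos GetFoos(bool returnNull) {
        Log.Add("GetFoos");
        return returnNull ? null : new Foos();
    }

    public static List<string> Run(bool returnNull) {
        Log.Clear();
        // If GetFoos returns null, MakeBars and AnnounceBars are skipped.
        GetFoos(returnNull)?.MakeBars()?.AnnounceBars();
        return new List<string>(Log);
    }

    public static void Main() {
        Console.WriteLine("null:     " + string.Join(", ", Run(true)));
        Console.WriteLine("non-null: " + string.Join(", ", Run(false)));
    }
}
```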


Disclaimer

This has all been armchair reasoning so please take this and the analogy with the elvis operator with a grain of salt.

Turpeth answered 13/3, 2018 at 23:22 Comment(4)
Good digging. The note about the receiver first appeared in the C# 4 spec. - Vassallo
While this does seem to answer the question, the logic it reveals is horrifying. The idea that chaining or not chaining a method could have semantic differences... I think I need a new job. - Upbeat
@Upbeat The line a.GetB().Hello(); has no return value, whereas once Hello() is omitted, the remaining a.GetB(); suddenly has a return value that just gets thrown away. However, your everyday method chaining will be more like var bars = GetFoos()?.MakeBars() where MakeBars() has a return value. If this short-circuits or you manually remove ?.MakeBars() you still end up with a valid assignment, even though the result might change. So keep in mind that this is an edge case, although a weird one. - Wilkins
This can be very sneaky. To be more explicit, suppose that instead of GetB you have myObject.DoImportantStuff().DoDispensableStuff(). I don't want the method DoImportantStuff to be discarded because DoDispensableStuff is conditional. Also, the behavior of myObject.DoImportantStuff().DoDispensableStuff() and myObject.DoImportantStuff()?.DoDispensableStuff() differs depending on the build target, which can be very complicated to detect. - Laidlaw
63

It comes down to the phrase:

(including evaluation of the receiver and parameters of the call) is omitted.

In the expression:

a.GetB().Hello();

the "evaluation of the receiver" is: a.GetB(). So: that is omitted as per the specification, and is a useful trick allowing [Conditional] to avoid overhead for things that aren't used. When you put it into a local:

var b = a.GetB();
b.Hello();

then the "evaluation of the receiver" is just the local b, but the original var b = a.GetB(); is still evaluated (even if the local b ends up getting removed).
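One way to picture why the receiver is dropped along with the call is to write the conditional method as an extension method, where the receiver literally is the first argument. The sketch below is only an illustration: BExtensions, ReceiverDemo and the DEMO_TRACE symbol (never defined, standing in for DEBUG) are made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public class B { }

public class A {
    // Records calls so we can see whether GetB ran.
    public readonly List<string> Log = new List<string>();

    public B GetB() {
        Log.Add("GetB");
        return new B();
    }
}

public static class BExtensions {
    // DEMO_TRACE is a hypothetical symbol that is never defined here.
    [Conditional("DEMO_TRACE")]
    public static void Hello(this B b) {
        Console.WriteLine("Hello");
    }
}

public static class ReceiverDemo {
    public static int AsReceiver() {
        var a = new A();
        a.GetB().Hello();             // receiver syntax
        return a.Log.Count;
    }

    public static int AsArgument() {
        var a = new A();
        BExtensions.Hello(a.GetB());  // same call, receiver spelled as an argument
        return a.Log.Count;
    }

    public static void Main() {
        // Both forms are omitted together with the GetB() call.
        Console.WriteLine("receiver form ran GetB " + AsReceiver() + " time(s)");
        Console.WriteLine("argument form ran GetB " + AsArgument() + " time(s)");
    }
}
```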

This can have unintended consequences, so: use [Conditional] with great care. But the reasons are so that things like logging and debugging can be trivially added and removed. Note that parameters can also be problematic if treated naively:

LogStatus("added: " + engine.DoImportantStuff());

and:

var count = engine.DoImportantStuff();
LogStatus("added: " + count);

can be very different if LogStatus is marked [Conditional] - with the result that your actual "important stuff" didn't get done.
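As a rough, self-contained illustration of that difference: the Engine class, its Added counter and the DEMO_LOG symbol below are assumptions made up for this sketch. DEMO_LOG is never defined, so LogStatus behaves as it would in a build without the logging symbol.

```csharp
using System;
using System.Diagnostics;

public class Engine {
    // Counts how many times the important work was actually done.
    public int Added { get; private set; }

    public int DoImportantStuff() {
        Added += 1;
        return Added;
    }
}

public static class LoggingDemo {
    // DEMO_LOG is a hypothetical symbol that is never defined here,
    // so every call to LogStatus is omitted along with its arguments.
    [Conditional("DEMO_LOG")]
    static void LogStatus(string message) {
        Console.WriteLine(message);
    }

    public static int Inline() {
        var engine = new Engine();
        // The argument expression is never evaluated: the work is lost.
        LogStatus("added: " + engine.DoImportantStuff());
        return engine.Added;
    }

    public static int Hoisted() {
        var engine = new Engine();
        // The work happens in its own statement; only the logging is dropped.
        var count = engine.DoImportantStuff();
        LogStatus("added: " + count);
        return engine.Added;
    }

    public static void Main() {
        Console.WriteLine("inline:  Added = " + Inline());
        Console.WriteLine("hoisted: Added = " + Hoisted());
    }
}
```

Under these assumptions Inline() leaves Added at 0 while Hoisted() leaves it at 1, which is why hoisting anything with side effects out of the conditional call is the safer habit.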

Sewellyn answered 13/3, 2018 at 10:13 Comment(9)
I'm indeed using this feature to allow debugging while avoiding overhead in a release build. But if you take a closer look at my question, you'll see that since I noticed the added detail in the 6.0 spec, I know that this is (now?) the expected behavior, and I'm more interested in the reason why. This is an important side effect of conditional compilation in C# that wasn't documented before (parameters being ignored is a known fact, as far as I know). Thanks for giving more insight into this, though - that should prove useful for future readers! - Frausto
@Frausto I think it has always behaved that way, though - and been intended to behave that way. So the key point is: the specification now makes it more obvious. - Sewellyn
I'm not sure how we can say for certain that this is intended, but if we assume it is, that's still not really answering my question. - Frausto
Maybe Eric can shed more light on this. - Lamellirostral
@JonH: Marc is correct; the intention was always that the entire statement becomes a no-op. If that was omitted from some version of the spec, then that was an error in the spec. In fact, there was a work item entered into the notes in 2003 that mentions that the ECMA spec needs to be clarified; apparently no one followed up on that note. - Vassallo
@EricLippert and that's why we love you; great seeing you in passing last week btw - a shame I didn't manage to catch you, but I suspect you were very busy :) - Sewellyn
@Lamellirostral OK, what crazy Beetlejucian incantation did you use to summon Eric there? - Sewellyn
Were you at the summit? Next time find me and we can get lunch or something. - Vassallo
Marc the force is with me. Well not really but Eric always finds the best questions to answer. When I put in that comment I bet myself he’d find it without pinging him. Sure enough I won that bet...been lucky lately. Tell you what...this isn’t the first time! - Lamellirostral
19

Should it really behave differently based on whether the receiver of the conditional call is stored in a temporary variable or not?

Yes.

What's the rationale behind the C# compiler not evaluating a.GetB() in this situation?

The answers from Marc and Søren are basically correct. This answer is just to clearly document the timeline.

  • The feature was designed in 1999, and the intention of the feature was always to remove the entire statement.
  • The design notes from 2003 indicate that the design team realized then that the spec was unclear on this point. Up until this point the specification only called out that arguments would not be evaluated. I note that the spec makes the common mistake of calling the arguments "parameters", though of course one could suppose that they meant "actual parameters" rather than "formal parameters".
  • A work item was supposed to be created to fix the ECMA specification on this point; apparently that never happened.
  • The first time that the corrected text appears in any C# specification was the C# 4.0 specification, which I believe was 2010. (I do not recall if this was one of my corrections, or if someone else found it.)
  • If the 2017 ECMA specification does not contain this correction, then that's a mistake which should be fixed in the next release. Better 15 years late than never, I guess.
Vassallo answered 14/3, 2018 at 0:10 Comment(0)
