Inconsistency in C# spec 7.16.2.5

Asked 9/11, 2013 at 14:4 Answered 21/1, 2014 at 13:33

c#language-lawyer roslyn linq-query-syntax

I'm having a go at implementing C# spec 7.16.2 "Query expression translation" in Roslyn. However, I've run into a problem in 7.16.2.5 "Select clauses".

It reads

A query expression of the form
from x in e select v
is translated into
( e ) . Select ( x => v )
except when v is the identifier x, the translation is simply
( e )
For example
from c in customers.Where(c => c.City == "London")
select c
is simply translated into
customers.Where(c => c.City == "London")

My code does not produce a result matching the example, because (as per the "except when" line) I translate from x in e select x into ( e ), rather than just e. Thus my code translates the example into

( customers.Where(c => c.City == "London") )

Is the example in the spec wrong, or do I need to be doing processing to recognise whether the enclosing parentheses are necessary? If so, is this in the spec somewhere?

Similarly, 7.16.2.6 (Groupby clauses) says

A query expression of the form
from x in e group v by k
is translated into
( e ) . GroupBy ( x => k , x => v )
except when v is the identifier x, the translation is
( e ) . GroupBy ( x => k )
The example
from c in customers
group c.Name by c.Country
is translated into
customers.
GroupBy(c => c.Country, c => c.Name)

where again the example result is missing the parentheses suggested by the spec.

Typhoid answered 9/11, 2013 at 14:4 Comment(9)

I think that the parentheses indicate that e is evaluated first and select,groupby as applied later. The example uses a set customers where customers is the same as (customers), but imagine you have an operator + used for concatenating two lists - from x in a + b has to be translated into (a+b).Where. My guess is that () is correct and the example is only simplified version of the same (for this specific case) – Pennywise 9/11, 2013 at 14:24

@Pennywise Yeah, I figured that was why the parentheses are there (and thanks for the nice example of where it would break without them!). I'm just concerned because, as the examples in the spec are simplified, I'm wondering whether I need to figure out how to make my own implementation perform the same simplification. – Typhoid 9/11, 2013 at 14:32

@Typhoid Why would you need to do that? The code with parentheses is always going to be correct. – Wean 9/11, 2013 at 14:46

@Wean Yes, correct as in it'll work, but not "correct" as in "it matches the examples given in the spec"! I'd give the spec ("wrap it in parentheses") precedence over the examples ("don't wrap it in parentheses when unnecessary"), I'm just a bit worried I've missed a bit of the spec that says "this is when parentheses are unnecessary". – Typhoid 9/11, 2013 at 14:55

Whether parentheses are added or not cannot be distinguished by running the program and observing its output. Therefore, this issue does not exist. I also don't think the spec is intending to say that parentheses should be added. I don't even know what that would mean. Added to what? To the syntax tree? The tree is an implementation detail and the spec surely does not prescribe how the compiler has to structure its data internally. – Dumond 9/11, 2013 at 16:16

@Dumond It can be distinguished; in some cases (as in the comments above) failing to add parentheses will lead to a program that doesn't compile. – Typhoid 9/11, 2013 at 16:57

@Dumond I'm not quite sure of the terminology, but I believe the translation does apply to the syntax-tree. For example, the problem translation is supposed to take a query-expression node matching certain criteria and translate it into a parenthesized-expression, containing the original query-expression node's from-clause child node's expression child-node. – Typhoid 9/11, 2013 at 16:58

@Typhoid if you assume that substituting e into a different place works like a string insertion, then parentheses are necessary. A more natural assumption would be that e is a value that can be inserted anywhere.; Does the spec assign different semantics to a parenthesized-expression? In other words, can the expression a mean something different than (a)? I don't think so. That makes it indistinguishable. (As I'm talking about expressions here, the associativity issue does not apply.) – Dumond 9/11, 2013 at 17:22

@Dumond I think you've hit the crux of the matter :) The "spec" part of the spec seems to be treating it as string replacement, and so is adding parentheses to cover the difference in associativity between query syntax and method syntax. The "example" part of the spec seems to be treating it as parsed syntax, so isn't adding parentheses where they're not necessary. (I wish there was an example where the e had an addition in it...) I'm not even sure what Roslyn does - maybe I'll try it with an addition and see whether it adds brackets where they're necessary! – Typhoid 9/11, 2013 at 17:43

In the example the construct 'e' is an expression and the construct '( e )' represents a primary. That is, there is a production in the C# grammar that allows '( e )' to be used anywhere a primary is expected. There is also a production that allows a primary to be used wherever an expression is expected.

In the 'from' code fragment an expression 'e' is required (as per the C# grammar), and in the 'Select()' fragment a primary is required, represented here as '( e )'.

The author of the example (perhaps unwisely) chose a primary 'customers' rather than an expression to illustrate the point. If the example had used an expression instead then the translation would have contained the parentheses. The example is correct, but borderline misleading.

In answer to your question, you can recognise whether the parentheses are necessary by recognising whether you are dealing with a primary or an expression. For a primary, they are not.

Disclosure: my expertise is compiler technology, C# syntax and Reflection.Emit but not (yet) Roslyn. I couldn't find any online docs, so I can't tell you how to do that in Roslyn.

Cannon answered 21/1, 2014 at 13:33 Comment(1)

Hi David, thanks to this I've found the "primary expressions" section in the C# spec which does list a load of expressions which I believe I could definitely remove the parentheses from :D I'm not sure if there's a nice Expression.IsPrimary in Roslyn I can use, but I can definitely work with the list. – Typhoid 21/1, 2014 at 13:52

Your translation with the additional (and ) is correct. But if you want to eliminate redundant parentheses, you can call

.WithAdditionalAnnotations(CodeAnnotations.Simplify)

on your resulting SyntaxNode.

Klansman answered 13/11, 2013 at 10:11 Comment(3)

What effect is this supposed to have? I've tried it on the result of parsing a simple (Enumerable.Range(1, 5)).Select(x => x*x) and it doesn't appear to do anything. The parentheses are still in the tree itself an in the output of e.g. ToFullString(). – Typhoid 13/11, 2013 at 13:9

@Typhoid Hmmm. In a CodeAction runnng in Visual Studio, the effect is a.o. that redundant parenthesis are removed. If you are using Rolsyn in your own app, I assume you need to call some Roslyn method to do the actual simplification, but I have no idea what method that is, or even if it is available. It may be a Visual Studio feature. – Klansman 13/11, 2013 at 15:25

Huh I'll have to try that. I've found a way to do it in code but it's horrible :D – Typhoid 13/11, 2013 at 16:9

Recommended topics

Hot tags