How do Java, C++, C#, etc. get around this particular syntactic ambiguity with < and >?

About

Asked 19/1, 2013 at 8:29 Answered 19/1, 2013 at 16:25

Solved java parsing grammar context-free-grammar lr-grammar

I used to think C++ was the "weird" one with all the ambiguities with < and >, but after trying to implement a parser I think I found an example which breaks just about every language that uses < and > for generic types:

f(g<h, i>(j));

This could be syntactically either interpreted as a generic method call (g), or it could be interpreted as giving f the results of two comparisons.

How do such languages (especially Java, which I thought was supposed to be LALR(1)-parsable?) get around this syntactic ambiguity?

I just can't imagine any non-hacky/context-free way of dealing with this, and I'm baffled at how any such language can be context-free, let alone LALR(1)-parsable...

(It's worth noting that even a GLR parser can't return a single parse for this statement with no context!!)

Renayrenckens answered 19/1, 2013 at 8:29 Comment(10)

+1 great question, Just a note, the java compiler has plenty of heuristics to resolve ambiguity, it guesses plenty. They themselves admit that their generic system is broken and there are cases it fails (although they are rate), look at the java 8 features presentation they explain some of the problems and how they'll address them in java 8. – Highpitched 19/1, 2013 at 8:35

@BenjaminGruenbaum: Thanks. :) Regarding heuristics -- they're definitely great for generating good compiler error messages, but unless they're part of the language spec's grammar, they can't be legally used to resolve ambiguities, can they? And if they are part of the grammar, then how can the grammar be context-free (let alone LR(1)-parsable)? – Renayrenckens 19/1, 2013 at 8:36

its not amibugous, it depends on what g is declared as – Hesiod 19/1, 2013 at 8:43

@NimChimpsky: Yeah, that's the point -- it's syntactically ambiguous (since it depends on the context), but not semantically ambiguous (since, given the context, it's unambiguous). – Renayrenckens 19/1, 2013 at 8:44

Could this be part of why scala went with []? – Postaxial 19/1, 2013 at 8:52

@KarthikT: Huh, maybe... I don't know Scala well at all unfortunately but that sounds very plausible. – Renayrenckens 19/1, 2013 at 8:54

@NimChimpsky: Also, I just noticed, your name is surprisingly similar to Chomsky! :) – Renayrenckens 19/1, 2013 at 9:4

@Mehrdad lol, its the name of a talking monkey, named after noam. – Hesiod 19/1, 2013 at 9:9

@Mehrdad. It's definitely why D went with !(). dlang.org/templates-revisited.html ("D solves it by noticing that ! is not used as a binary operator") – Ikkela 20/1, 2013 at 1:17

The point of GLR is to give you all the valid syntactic parses of your context free language. It would have no trouble parsing your example, and would produce both parses. To use this result, you'd have to decide which of the ambiguous choices made sense, using something other than this particular form (e.g., actual type information about "g" is). That's how we solve with our tools, and it nicely isolates parsing from name and type resolution. As other have noted, Java was LALR(1) in a galaxy long, long ago, and far far away. – Uproot 5/6, 2013 at 5:24

a generic method call in java would be <h,i>g(j) so there is no ambiguity :)

Annates answered 19/1, 2013 at 16:25 Comment(2)

Wow, you're right... I thought it would be g<h, i>(j) but apparently it isn't! +1 Great answer, thanks :) – Renayrenckens 19/1, 2013 at 19:11

That was my reaction, too. I think it make it easier for the parsers, and I have to admit it is a really ugly hack. Langauge designers shouldn't listen to the compiler guys complaining when they design; you get stuff like this. – Uproot 5/6, 2013 at 5:27

I just can't imagine any non-hacky/context-free way of dealing with this, and I'm baffled at how any such language can be context-free, let alone LALR(1)-parsable...

The answer is that they aren't (at least not Java and C++; I know very little about C#). The Java grammar that you link to dates back to 1996, way before generics have been introduced.

For further discussion, see Are C# and Java Grammars LALR(x)?

Irremediable answered 19/1, 2013 at 9:13 Comment(1)

Not even C is fully context-free since the use of a symbol can represent either an identifier or a type name, depending on declarations that can be separated from the use by an arbitrary amount of code. – Rigdon 23/1, 2013 at 18:18

Hot tags

Godot Unity Godot Help Programming Godot 4.X GUI GDScript 3D 2D Physics CSharp Godot 3.X VR XR Projects C++

Recommended topics

Hot tags