Real-world LR(k > 1) grammars?

Making artificial LR(k) grammars for k > 1 is easy:

Input: A1 B x
Input: A2 B y                   (introduces a reduce-reduce conflict for the terminal a)
A1   : a
A2   : a
B    : b b b ... b              (terminal b occurs k-1 times)
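
For example, with k = 2 the construction gives:

Input: A1 B x
Input: A2 B y
A1   : a
A2   : a
B    : b

After shifting the a, the parser must choose between reducing it to A1 and reducing it to A2. The next token is b either way, so one token of lookahead cannot decide; the token after that (x or y) does, so the grammar is LR(2) but not LR(1).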

However, are there any real-world non-LR(1) computer languages that are LR(k > 1)-parsable?
Or is a language that is not LR(1) necessarily not LR(k) for any k?

Discriminate asked 26/11, 2013 at 2:18

If a language has an LR(k) grammar, then it has an LR(1) grammar which can be generated mechanically from the LR(k) grammar; furthermore, the original parse tree can be recreated from the LR(1) parse tree. A proof of this theorem appears in Section 6.7 of Parsing Theory Vol. II, Sippu & Soisalon-Soininen; they attribute it to "On the complete covering problem for LR(k) grammars" (1976) by M. D. Mickunas, JACM 23:17-30.

So there's no language parseable as LR(k) for k > 1 which is not also parseable as LR(1), and consequently there really isn't such a thing as an LR(k) language for values of k other than 0 and 1.
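
For instance, the k = 2 grammar in the question is LR(2) but not LR(1), yet its (two-string) language clearly has an LR(1) grammar. One hand-made way to get one (just an illustration, not the construction from the paper) is to fold the deciding lookahead into the nonterminal names:

Input: A1_x x
Input: A2_y y
A1_x : a B                      (reducing A1_x tells you the a was an A1)
A2_y : a B                      (reducing A2_y tells you the a was an A2)
B    : b

The reduction that distinguishes A1 from A2 now happens after the B has been read, where one token of lookahead (x or y) is enough, and the original reduction A1 : a or A2 : a can be emitted at that point to rebuild the original parse tree. That "postpone the reduction" idea is what the general construction formalizes.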

However, there are languages for which an LR(2) grammar is much easier to write than any LR(1) grammar. One classic example is the POSIX grammar for yacc itself, which does not require productions to be terminated with a ;. This is unambiguous, because a production must start with an identifier followed by a colon. Written that way, the grammar is clearly LR(2); by the above theorem, an LR(1) grammar exists, but it's nowhere near as clean.
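
A minimal sketch of the troublesome fragment, with made-up token names rather than the ones in the POSIX specification:

rules  : rule
       | rules rule
rule   : ID ':' rhs
rhs    : /* empty */
       | rhs ID

When the parser is at the end of a rhs and the next token is an ID, one token of lookahead cannot tell whether that ID is one more symbol of the current rhs (shift) or the left-hand side of the next rule (reduce the current rule first). The token after the ID settles it: a ':' there means a new rule is starting. Two tokens of lookahead are therefore enough, which is what makes this fragment LR(2).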

Alexander answered 26/11, 2013 at 4:50. Comments (12):
Interesting, I knew you could generate the LR(1) grammar mechanically, but I didn't know you could also recover the parse tree mechanically! +1 – Discriminate
Do you know if there is a simpler description of how to recover the full parse tree somewhere (aside from the paper/book you linked to)? Transforming the grammar is probably doable, but I can't imagine how you can recover the parse tree without effectively making a general LR(k) parser in the process... – Discriminate
@Mehrdad: I have a copy of Sippu & Soisalon-Soininen in front of me, but that doesn't help you much :(. It does, however, mean that I've never bothered to find another source for the theorem. Do you have an academic library nearby, perhaps? – Alexander
Ahh haha I see. Yes, good idea, I'll go check it sometime :) thanks a bunch! – Discriminate
Here's a quick quote: "The idea... is to "shift" the derivation trees in G k symbols to the right; in the parser, this will mean that reduce actions are postponed until 1-symbol lookahead is sufficient... To achieve this, any nonterminal A of G is replaced by a set of nonterminals of the form (x, A, y) where y is a string in FOLLOWk(A) and x is a string in FIRSTk(A)." I think that you can probably work out the transformation from that... – Alexander
Thanks -- just to check, is this the algorithm for converting the LR(k) grammar into an LR(1) grammar, or is it the algorithm for recovering the LR(k) tree from the LR(1) tree? – Discriminate
@Mehrdad: It's the algorithm for making the LR(1) grammar. But from the transformed reduction, you can recover the original reduction (for the nonterminal A, k symbols in the past), and you can just replay the original reductions to build the original parse tree. – Alexander
Oh, the part that confused me wasn't transforming the grammar -- it was recovering the parse tree. I'm not sure what you mean by "replay the original reductions" here -- if we couldn't perform the reductions during the first pass, what makes it easier to do so in the second pass? (Don't we encounter the same ambiguities as before?) – Discriminate
@Mehrdad: Maybe "replay" wasn't the right word. There's no second pass. Every time you do a reduction (in the cover LR(1) grammar), if that reduction corresponds to a reduction in the original grammar, you "do" the original reduction to build the original tree (so you don't actually build the LR(1) tree; you just use the reduction actions). There are no ambiguities because the reductions are effectively delayed k symbols (and the context is accumulated into the non-terminal itself). – Alexander
Oooh interesting... I think that's starting to make sense, thanks :) – Discriminate
@Mehrdad: My quote was a bit misleading; you actually need (x, V, y) where V is a terminal or original non-terminal. You end up with productions of the form (y0, A, ym) -> (y0, X1, y1) (y1, X2, y2) ... (ym-1, Xm, ym); when you do that reduction, you actually apply the reduction A -> X1 X2 ... Xm to the parse tree you are building. You also end up with rules of the form (ax, a, xb) -> b where xb is in FOLLOWk(a); those productions don't correspond to any reduction. – Alexander
Haha okay, I'll have to read it carefully again sometime later... LR parsing itself took me a while to understand, so I can only imagine extending it will be harder :) – Discriminate
