Limitations of LL vs LR parsers?
Asked Answered
T

7

17

I know the basic differences of LL vs LR parsers. I also know that GLR, SLR, and LALR are all extensions of LR parsers. So my question in more details is...

Given a LL(*) parser and any variation on a LR parser, is there any language that can be described in one and not the other? Or more simply is there any feature or property that can not be expressed with either?

As a concrete example. If I were to create a language using an LL(*) parser, will I ever run into desired feature/property that I might want to add to my language that would only be possible with a LR parser (or vice versa)?

Tedmann answered 29/3, 2011 at 2:30 Comment(0)
N
7

Sure. LL parsers cannot handle any grammars with left recursion. L(AL)R(k) parsers for fixed k won't be able to parse some things that an LL(*) parser can handle, because k<*.

Nim answered 29/3, 2011 at 2:33 Comment(5)
Can't you rewrite your rules to remove left recursion; I notice the wiki page for LL parsers mentions removal of left conflicts.Tedmann
Are you claiming that any grammar with left recursion cannot be rewritten for LL* such that the left recursion is eliminated?Paripinnate
Does it matter that LL parsers cannot handle grammars with left recursion? I have this vague memory that any language which can be expressed with left recursion can also be expressed with right recursion. IOW: you might have to factor your grammar differently when using an LL parser vs. an LR parser, but the question wasn't about whether or not you need to rewrite your grammar, it was about whether or not there exists a language which can be recognized by an LL parser but not an LR parser and vice versa.Renaud
If you are willing to rewrite the grammar, you can make most parsers "parse" a superset of the real langauge'; you can then write additional tests to remove the extraneous parts. The question is, how much work are you willing to go to? The OP's original question was about the strength of LL(*) vs. LALR(k).Nim
Well, Early algorithms solved the problem of context-free parsing long ago, just not very efficiently in practice. GLR solved the problem in practical sense quite awhile back. (I have a huge toolsuite that parses all kinds of stuff including C++11 using GLR, see my bio). GLL is an interesting recent contender that I haven't investigated much. ... what specific advantages over GLR do you think it has?Nim
H
13

Here are a couple of opinion pieces, you can consider them point and counterpoint:

Hose answered 29/3, 2011 at 2:39 Comment(1)
I am not very concerned with people's thoughts on the parsers as much as I am the theoretical limitations each has.Tedmann
N
7

Sure. LL parsers cannot handle any grammars with left recursion. L(AL)R(k) parsers for fixed k won't be able to parse some things that an LL(*) parser can handle, because k<*.

Nim answered 29/3, 2011 at 2:33 Comment(5)
Can't you rewrite your rules to remove left recursion; I notice the wiki page for LL parsers mentions removal of left conflicts.Tedmann
Are you claiming that any grammar with left recursion cannot be rewritten for LL* such that the left recursion is eliminated?Paripinnate
Does it matter that LL parsers cannot handle grammars with left recursion? I have this vague memory that any language which can be expressed with left recursion can also be expressed with right recursion. IOW: you might have to factor your grammar differently when using an LL parser vs. an LR parser, but the question wasn't about whether or not you need to rewrite your grammar, it was about whether or not there exists a language which can be recognized by an LL parser but not an LR parser and vice versa.Renaud
If you are willing to rewrite the grammar, you can make most parsers "parse" a superset of the real langauge'; you can then write additional tests to remove the extraneous parts. The question is, how much work are you willing to go to? The OP's original question was about the strength of LL(*) vs. LALR(k).Nim
Well, Early algorithms solved the problem of context-free parsing long ago, just not very efficiently in practice. GLR solved the problem in practical sense quite awhile back. (I have a huge toolsuite that parses all kinds of stuff including C++11 using GLR, see my bio). GLL is an interesting recent contender that I haven't investigated much. ... what specific advantages over GLR do you think it has?Nim
R
6

You might find interesting this paragraph in Wikipedia, which says, that LL(*) grammars are subset of LR(k) grammars: http://en.wikipedia.org/wiki/Context-free_grammar#Restrictions So you can parse more languages using LR parsing methods.

Rust answered 2/4, 2011 at 4:58 Comment(1)
I wish the paragraph had cited its sources; That would have been a good read. This seems to contradict Ira's answer (which I have accepted for now).Tedmann
A
6

There are some grammars that cannot be "rewritten" to be parsed by LL parsers which can be parsed by LR parsers. One example: Given a simple grammar that constructs terms with substractions:

S -> S - S | num

You obviously have left-recursion here, which cannot be handled by LL-parsers. To make this grammar parsable by LL, you have to eliminate the left-recursion:

S -> num S'

S' -> - num S' | epsilon

Now your LL parser can handle this grammar. But when building up your syntax tree for a term like 4 - 2 -1, a depth-first-search operated on the tree will give you 4 - (2 - 1) = 3 instead of (4 - 2) - 1 = 3 as you would expect.

The reason for this is, that you have to use left-recursive rules in your grammar to handle left-associative operators (like substract). But left-recursive rules cannot be handled by LL-parsers.

So here you have a class of languages that cannot be handled by LL.

Aculeus answered 4/5, 2011 at 12:16 Comment(4)
you only talk about LL here when my question was more focused on LL vs LR. Granted there limitations to each but do the limitations with LL also have an equivalence limitation with LR?Tedmann
It's good to make this salient. But your "here you have a class of languages that cannot be handled by LL" isn't the right lesson. The language is a set of strings, and here the intended language does contain the string "4-2-1" and is accepted by the parser. The lesson is rather that if you want a left-associative interpretation of that string, you'll have to do more work in how you handle the parser's output.Subsist
It may also be worth making explicit that if we extend your grammar with parentheses, the LL parser won't give 4-2-1 and 4-(2-1) the same AST, so these strings can be given different interpretations. What is true is that the structure of the AST for 4-2-1 will have the shape [- 4 [- 2 1]]. So, as I said, some work is needed to give that the desired left-associative interpretation.Subsist
@dubiousjim, i suppose the purpose of a parser is to parse a string, not to "accept" it.Clot
A
5

LR parser can accept bigger class of languages than LL. In LL(k) and LR(k), k means number of lookahead symbols that it needs to know so it can apply appropriate production/reduction. The bigger k get, the bigger are parsing tables. So k is not limiting only LR parsers, it is limiting LL parsers, too. The reason why LR parser can accept bigger class of languages is because of left recursion that are problematic when working with LL parsers. But this is not true entirely, because direct recursion are solvable, meaning that you can rewrite the grammar to be LL grammar. Direct recursion is something like A -> Abc. When you have indirect recursion, for which you now probably know how it looks, then you have a problem. LR parsers can solve this issues because they make parse tree in bottom-up manner. You will have to look little bit deeper into LR parsing to see why is that exactly. But, LR parsers are not all mighty, they have limitations, too. Some grammars are very difficult to digest and k factor does not help. For this kind of grammars GLR parser is needed, which actually simulates LR(k) parser but uses backtracking to analyze entire parse space when production/reduction ambiguity occurs.

Armure answered 21/9, 2011 at 5:21 Comment(0)
T
1

An LR (1) parser recognizes more grammars than an LL (1) because it doesn't need us to rewrite the grammar to remove left recursion and common factors.

However, given a unambiguous grammar, we can always remove left recursion and common factors, and so in the end they can both parse the same languages.

If anyone has doubts about this, I challenge you to give an unambiguous grammar that an LL(1) cannot parse and an LR (1) can, that is, that cannot be rewritten to remove left recursion and factors common

Thermoelectricity answered 19/4, 2021 at 9:11 Comment(0)
W
-1

LL parsing is theoretically O(n^4), or pretty dang slow. LR parsing is faster, O(n^3), or pretty slow. https://en.wikipedia.org/wiki/Top-down_parsing

Although I'd love to see proof of this.

Wolfram answered 14/2, 2018 at 23:24 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.