LL(1) to LR(1) Transformation
Asked Answered
C

0

7

I recently finished writing a recursive-descent parser for this LL(1) grammar (which generates a tiny subset of XML):

document ::= element EOF
element ::= < elementPrefix
elementPrefix ::= NAME attribute elementSuffix
attribute ::= NAME = STRING attribute
attribute ::= EPSILON
elementSuffix ::= > elementOrData endTag
elementSuffix ::= />
elementOrData ::= < elementPrefix elementOrData
elementOrData ::= DATA elementOrData
elementOrData ::= EPSILON
endTag ::= </ NAME > 

I had converted the grammar to LL(1) from this simple EBNF grammar:

document  ::=  element
element   ::=  start_tag (element | DATA)* end_tag | empty_tag
start_tag ::=  < NAME attribute* >
end_tag   ::=  </ NAME >
empty_tag ::=  < NAME attribute* />
attribute ::=  NAME = STRING

Right now, I am in the process of writing a shift-reduce parser that recognizes the same grammar. I realize that every LL(1) grammar is also LR(1). However, my professor tells me that it 'probably wouldn't be convenient' to write a shift-reduce parser for the above LL(1) grammar. Which leads me to think that I need to transform it to LR(1) before starting to code the parser.

Assuming it indeed wouldn't be a good idea to code an LR(1) parser using the LL(1) grammar above, how can I transform it to LR(1)? What would I need to change to actually make it more suitable for a hand-coded LR(1) parser?

Addendum: The tokens are NAME, STRING, DATA, >, <, />, </, and =.

Update 11/3/16:

Evidently, transformation was not necessary. The grammar was already in LR(1) and I was able to confirm it after more research. I have now finished implementing both parsers, and I thank everyone who has been able to help!

Cheesy answered 24/10, 2016 at 20:37 Comment(11)
Why wouldn't it be convenient?Hienhieracosphinx
@IraBaxter So the grammar I have up there is appropriate to use for hand-coding a shift-reduce parser?Cheesy
"Every LL(1) grammar is an LR(1) grammar". Yes. For convenience, you might pull out the Kleene star subexpressions and make them into separate rules using recursion.Hienhieracosphinx
@IraBaxter Isn't this what I already did in the LL(1) grammar at the top of this post?Cheesy
Can you (or anybody else) explain how you want to hand-code a LR-Parser? Writing the LR-Items on a paper and then copying all transitions into a transition table? Then try to find a LR-grammar that produces minimal states. This needs much experience and a bit of luck - I think there is no method to do it systematically.Tortricid
@Tortricid I have no idea yet. The recursive descent parser took less than an hour to implement once I got the grammar down. I hope I will be as lucky with this one.Cheesy
Your grammar is already LR(1), so probably the question should be: "How can I transform an inconvenient LR(1)-grammar, to a convenient one". And there is no solution, especially because it is very uncommon to handcode a LR-Parser. The professor probably pointed to the fact that your EBNF-grammar is already LR(1) - it is much better readable and produces a much more convenient syntax tree.Tortricid
I recommend you to use left-recursive instead of right-recursive. Right-recursive rules may cause shift/reduce conflicts in LR parsers.Cadastre
It seems you want to finish your project simply getting help on each piece: #40012407 Probably, you should open a book and read some basics first.Goddamn
@Goddamn I have since finished coding both parsers. Thank you so much for your help throughout! :)Cheesy
@Cadastre Thank you for this note. I fixed my parser accordingly!Cheesy

© 2022 - 2024 — McMap. All rights reserved.