I recently finished writing a recursive-descent parser for this LL(1) grammar (which generates a tiny subset of XML):
document ::= element EOF
element ::= < elementPrefix
elementPrefix ::= NAME attribute elementSuffix
attribute ::= NAME = STRING attribute
attribute ::= EPSILON
elementSuffix ::= > elementOrData endTag
elementSuffix ::= />
elementOrData ::= < elementPrefix elementOrData
elementOrData ::= DATA elementOrData
elementOrData ::= EPSILON
endTag ::= </ NAME >
I had converted the grammar to LL(1) from this simple EBNF grammar:
document ::= element
element ::= start_tag (element | DATA)* end_tag | empty_tag
start_tag ::= < NAME attribute* >
end_tag ::= </ NAME >
empty_tag ::= < NAME attribute* />
attribute ::= NAME = STRING
Right now, I am in the process of writing a shift-reduce parser that recognizes the same grammar. I realize that every LL(1) grammar is also LR(1). However, my professor tells me that it 'probably wouldn't be convenient' to write a shift-reduce parser for the above LL(1) grammar. Which leads me to think that I need to transform it to LR(1) before starting to code the parser.
Assuming it indeed wouldn't be a good idea to code an LR(1) parser using the LL(1) grammar above, how can I transform it to LR(1)? What would I need to change to actually make it more suitable for a hand-coded LR(1) parser?
Addendum: The tokens are NAME
, STRING
, DATA
, >
, <
, />
, </
, and =
.
Update 11/3/16:
Evidently, transformation was not necessary. The grammar was already in LR(1) and I was able to confirm it after more research. I have now finished implementing both parsers, and I thank everyone who has been able to help!