I'm about to make a parser for a fictional subset of XML with the following EBNF grammar:
DOCUMENT ::= ELEMENT
ELEMENT ::= START_TAG (ELEMENT | DATA)* END_TAG | EMPTY_TAG
START_TAG ::= < NAME ATTRIBUTE* >
END_TAG ::= </ NAME >
EMPTY_TAG ::= < NAME ATTRIBUTE* />
ATTRIBUTE ::= NAME = STRING
The above is the grammar 'as is', without any changes. Here is my attempt at converting it to LL(1):
DOCUMENT ::= ELEMENT EOF
ELEMENT ::= PREFIX > ELEMENT_OR_DATA END_TAG
| PREFIX />
PREFIX ::= < NAME OPT_ATTR
ELEMENT_OR_DATA ::= OPT_ELEMENT ELEMENT_OR_DATA
| OPT_DATA ELEMENT_OR_DATA
| epsilon
OPT_ELEMENT ::= ELM_LIST | epsilon
ELM_LIST ::= ELEMENT | ELEMENT ELM_LIST
OPT_DATA ::= DATA_LIST | epsilon
DATA_LIST ::= DATA | DATA DATA_LIST
END_TAG ::= </ NAME >
OPT_ATTR ::= ATTR_LIST | epsilon
ATTR_LIST ::= ATTRIBUTE | ATTRIBUTE ATTR_LIST
ATTRIBUTE ::= NAME = STRING
EOF ::= &$
Is this an LL(1) version of the original? If not, where did I go wrong? And if so, is there any way to 'simplify' without changing the meaning? I'm not convinced I have the simplest possible version.
Hope this is clear.