I am writing a Custom Language plugin for IntelliJ.
Here is a simplified example of the language. Note that the structure is recursive:
I have successfully implemented the Flex and BNF files, but I'm not sure how to add error recovery. I've read about the recoverWhile and pin attributes in Grammar-Kit's HOWTO, but I don't see how to apply them to my scenario.
I call the brown items above ("aaa", "ccc", etc...) "items".
I call the yellow ones ("bbb", "ddd", ...) "properties".
Each item has an item name (e.g. "aaa"), a single property (e.g. "bbb"), and can contain other items (e.g. "aaa" contains "ccc", "eeee", and "gg").
At the moment, the plugin doesn't behave well when an item is malformed. For example:
In this example, I would like the parser to "understand" that "ccc" is the name of an item with a missing property (e.g. by detecting a newline before the closing bracket).
I don't want the broken "ccc" item to influence the parsing of "eeee", but I do want the PSI tree to contain the elements of "ccc" that are present in the text (in this case, its name).
Here are the FLEX and BNF that I use:
FLEX:
CRLF= \n|\r|\r\n
WS=[\ \t\f]
WORD=[a-zA-Z0-9_#\-]+
%state EOF
%%
<YYINITIAL> {WORD} { yybegin(YYINITIAL); return MyLangTypes.TYPE_FLEX_WORD; }
<YYINITIAL> \[ { yybegin(YYINITIAL); return MyLangTypes.TYPE_FLEX_OPEN_SQUARE_BRACKET; }
<YYINITIAL> \] { yybegin(YYINITIAL); return MyLangTypes.TYPE_FLEX_CLOSE_SQUARE_BRACKET; }
<YYINITIAL> \{ { yybegin(YYINITIAL); return MyLangTypes.TYPE_FLEX_OPEN_CURLY_BRACKET; }
<YYINITIAL> \} { yybegin(YYINITIAL); return MyLangTypes.TYPE_FLEX_CLOSE_CURLY_BRACKET; }
({CRLF}|{WS})+ { return TokenType.WHITE_SPACE; }
. { return TokenType.BAD_CHARACTER; }
BNF:
myLangFile ::= (item|COMMENT|CRLF)*
item ::=
itemName
(TYPE_FLEX_OPEN_SQUARE_BRACKET itemProperty? TYPE_FLEX_CLOSE_SQUARE_BRACKET?)?
itemBody?
itemName ::= TYPE_FLEX_WORD
itemProperty ::= TYPE_FLEX_WORD
itemBody ::= TYPE_FLEX_OPEN_CURLY_BRACKET item* TYPE_FLEX_CLOSE_CURLY_BRACKET
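For reference, one possible way to attach the HOWTO's pin and recoverWhile attributes to the rules above is sketched here. It is an untested sketch, not a verified solution. The rule names `propertyClause` and `item_recover` are hypothetical helpers that do not exist in the original grammar. Because the lexer above folds newlines into `TokenType.WHITE_SPACE`, this sketch cannot see the newline after an unclosed `[`, so on its own it would probably still consume the next word as the property; detecting the newline would require the lexer to emit a separate newline token.

```bnf
// Sketch only: propertyClause and item_recover are hypothetical helper rules.

item ::= itemName propertyClause? itemBody? {
  // After an item (matched or not), skip tokens until one that can
  // start the next item or close the enclosing body.
  recoverWhile="item_recover"
}

// Pinning on '[' means that once the bracket is seen, the clause is kept
// in the PSI tree even if the property or the closing ']' is missing.
private propertyClause ::= TYPE_FLEX_OPEN_SQUARE_BRACKET itemProperty TYPE_FLEX_CLOSE_SQUARE_BRACKET {
  pin=1
}

itemName ::= TYPE_FLEX_WORD
itemProperty ::= TYPE_FLEX_WORD

// Pinning on '{' keeps a body whose closing '}' is missing.
itemBody ::= TYPE_FLEX_OPEN_CURLY_BRACKET item* TYPE_FLEX_CLOSE_CURLY_BRACKET {
  pin=1
}

// Recovery predicate: stop skipping at a word (possible item name) or at '}'.
private item_recover ::= !(TYPE_FLEX_WORD | TYPE_FLEX_CLOSE_CURLY_BRACKET)
```

The recovery predicate deliberately stops at `TYPE_FLEX_CLOSE_CURLY_BRACKET` so that a broken nested item does not swallow the closing bracket of its parent's body.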