The lark parser predefines some common terminals, including a string. It is defined as follows:
_STRING_INNER: /.*?/
_STRING_ESC_INNER: _STRING_INNER /(?<!\\)(\\\\)*?/
ESCAPED_STRING : "\"" _STRING_ESC_INNER "\""
I do understand _STRING_INNER. I also understand how ESCAPED_STRING is composed. But what I don't really understand is _STRING_ESC_INNER.
If I read the regex correctly, all it says is that whenever I find two consecutive literal backslashes, they must not be preceded by another literal backslash?
How can I combine those two into a single regex?
And wouldn't it be required for the grammar to only allow escaped double quotes in the string data?
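To make the question concrete, here is a minimal sketch of what I think the combined pattern would look like in plain Python. It assumes the three terminal pieces are simply concatenated in order, which is my guess at what happens rather than something I have confirmed in Lark's source. The helper name first_string is made up for this example.

```python
import re

# Assumption: the pieces of ESCAPED_STRING are joined by plain concatenation.
#   "\""               -> "
#   _STRING_INNER      -> .*?
#   /(?<!\\)(\\\\)*?/  -> (?<!\\)(\\\\)*?
#   "\""               -> "
COMBINED = re.compile(r'"' + r'.*?' + r'(?<!\\)(\\\\)*?' + r'"')


def first_string(text):
    """Return the string literal matched at the start of text, or None."""
    m = COMBINED.match(text)
    return m.group(0) if m else None


# An escaped quote does not end the string.
print(first_string(r'"a \" b" tail'))   # "a \" b"
# An escaped backslash directly before the quote does end it.
print(first_string(r'"a\\" tail'))      # "a\\"
# No unescaped closing quote, so nothing matches.
print(first_string(r'"abc\"'))          # None
```

If this reading is right, the lookbehind plus the backslash pairs only constrain what sits directly in front of the closing quote: an even number of backslashes (possibly zero). Everything else inside the string is matched by .*? without any restriction.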
_STRING_INNER have two backslashes? in /.*?/ ? – Apostate