Why is Parsimonious rejecting my input with an IncompleteParseError?

Traceback (most recent call last):
  File "tests.py", line 13, in <module>
    print(grammar.parse("{ do-something some-argument }"))
  File "/usr/local/lib/python2.7/dist-packages/parsimonious/grammar.py", line 112, in parse
    return self.default_rule.parse(text, pos=pos)
  File "/usr/local/lib/python2.7/dist-packages/parsimonious/expressions.py", line 109, in parse
    raise IncompleteParseError(text, node.end, self)
parsimonious.exceptions.IncompleteParseError: Rule 'program' matched in its entirety, but it didn't consume all the text. The non-matching portion of the text begins with '{ do-something some-' (line 1, column 1).

I am very far from an expert on Parsimonious, but I believe the problem is that ~".+" greedily matches the whole remainder of the input string, closing "}" included, leaving nothing to match the rest of the production. I tested that idea by changing the regex for rvalue to ~"[a-z0-9\\-]+", the same one you have for lvalue. Now it parses, and (awesomely) distinguishes by context between the two identically defined tokens lvalue and rvalue.

from parsimonious.grammar import Grammar

grammar = Grammar(
    """
    program = expr*
    expr    = _ "{" lvalue (rvalue / expr)* "}" _
    lvalue  = _ ~"[a-z0-9\\-]+" _
    rvalue  = _ ~"[a-z0-9\\-]+" _
    _       = ~"[\\n\\s]*"
    """
)

print(grammar.parse("{ do-something some-argument }"))
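
To see the greedy match in isolation, here is a small sketch using Python's standard re module rather than Parsimonious itself. It assumes the original rvalue regex was ".+" and that a Parsimonious regex term matches at a position much as a compiled Python pattern does; the variable names are only for illustration.

```python
import re

text = "{ do-something some-argument }"

# Start where rvalue would start: just after the lvalue and its whitespace.
pos = text.index("some-argument")

# Assumed original rvalue pattern: any run of characters.
greedy = re.compile(r".+").match(text, pos)

# Narrowed rvalue pattern, same character class as lvalue.
narrow = re.compile(r"[a-z0-9\-]+").match(text, pos)

print(repr(greedy.group()))  # 'some-argument }'
print(repr(narrow.group()))  # 'some-argument'

# The greedy match runs to the end of the input, so no "}" remains.
print(greedy.end() == len(text))  # True
```

The greedy pattern swallows the closing brace along with the argument, so the literal "}" in expr has nothing left to match. The narrowed pattern stops at the space.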

If you mean for rvalue to match any sequence of non-whitespace characters, you want something more like this:

rvalue = _ ~"[^\\s\\n]+" _

But whoops!

{ foo bar }

"}" is a closing curly brace, but it's also a sequence of one or more non-whitespace characters. Is it "}" or rvalue? The grammar says the next token can be either of those. One of those interpretations is parsable and the other isn't, but Parsimonious just says it's spinach and the hell with it: the repetition takes "}" as one more rvalue because that alternative matches, nothing is left for the literal "}", and the parser does not go back and give up the last repetition to try again. I don't know if a parsing maven would consider backing up like that a legitimate way to resolve the ambiguity (e.g. maybe such a grammar may result in cases with two possible interpretations that both parse), or how practical that would be to implement. In any case Parsimonious doesn't make that call.

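A rough way to watch the brace get eaten, again with plain re instead of Parsimonious: the hypothetical helper consume_rvalues below mimics a greedy, non-backtracking rvalue* loop with the "any non-whitespace" pattern. It is a sketch of the behaviour, not how Parsimonious is implemented.

```python
import re

RVALUE = re.compile(r"[^\s\n]+")  # the "any non-whitespace" rvalue
WS = re.compile(r"\s*")           # stands in for the _ rule


def consume_rvalues(text, pos):
    """Greedily match rvalue* from pos and never give a match back."""
    tokens = []
    while True:
        pos = WS.match(text, pos).end()  # skip whitespace
        m = RVALUE.match(text, pos)
        if m is None:
            return tokens, pos
        tokens.append(m.group())
        pos = m.end()


text = "{ foo bar }"
tokens, pos = consume_rvalues(text, text.index("bar"))

print(tokens)            # ['bar', '}']
print(text[pos:] == "")  # True: nothing is left for the literal "}"
```

Once the loop has claimed "}" as an rvalue, the closing-brace literal that follows in expr has no input to match, which is why the whole expr fails.
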
So we need to repel boarders on the curly brace issue. I think this grammar does what you want:

from parsimonious.grammar import Grammar

grammar = Grammar(
    """
    program = expr*
    expr    = _ "{" lvalue (expr / rvalue)* "}" _
    lvalue  = _ ~"[a-z0-9\\-]+" _
    rvalue  = _ ~"[^{}\\n\\s]+" _
    _       = ~"[\\n\\s]*"
    """
)

print(grammar.parse("{ do-something some-argument 23423 {foo bar} &^%$ }"))

I excluded open curly brace as well, because how would you expect this string to tokenize?

{foo bar{baz poo}}

I would expect

"{" "foo" "bar" "{" "baz" "poo" "}" "}"

...because if "poo}" is expected to tokenize as "poo" "}", and "{foo" is expected to tokenize as "{" "foo", then treating bar{baz as "bar{baz" or "bar{" "baz" is ~~deranged~~counterintuitive.
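
That tokenization can be checked with a few lines of plain Python. This is only a sketch with the standard re module (the TOKEN pattern and tokenize function are made up for illustration, not part of Parsimonious), using the same rule as the rvalue regex: braces always stand alone, and everything else is a run of characters that are neither braces nor whitespace.

```python
import re

# A brace is always its own token; otherwise take a run of
# characters that are neither braces nor whitespace.
TOKEN = re.compile(r"[{}]|[^{}\s]+")


def tokenize(text):
    """Split text into brace tokens and word tokens, dropping whitespace."""
    return TOKEN.findall(text)


print(tokenize("{foo bar{baz poo}}"))
# ['{', 'foo', 'bar', '{', 'baz', 'poo', '}', '}']
```

Dropping "{" from the excluded set would turn "bar{baz" into a single token, which is the case argued against above.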

Now I remember how my bitter hatred of yacc drove me to an obsession with it.
