Shift/Reduce conflict in Bison, when trying to add optional rule
Asked Answered
B

1

1

I am trying to solve shift/reduce conflict in Bison. I have following grammar rules

new_expr:
    T_NEW class_name_reference optional_generics_list ctor_arguments
        { $$ = zend_ast_create(ZEND_AST_NEW, $2, $4, $3); }
|   T_NEW anonymous_class
        { $$ = $2; }

optional_generics_list:
    /* empty */     { $$ = NULL; }
|   generics_list   { $$ = $1; }

ctor_arguments:
    /* empty */ { $$ = zend_ast_create_list(0, ZEND_AST_ARG_LIST); }
|   argument_list { $$ = $1; }

The problem here lies in that fact, that both optional_generics_list and ctor_arguments can be empty. How can I specify (if I could) that if both optional_generics_list and ctor_arguments are empty, then ctor_arguments should have higher priority. Or maybe my question is not correct, and how can solve this conflict instead.

Some updated info: Maybe output of generated .output file will help:

    State 156 conflicts: 1 shift/reduce

State 156:

  303 new_expr: "new (T_NEW)" class_name_reference . optional_generics_list ctor_arguments

    '<'  shift, and go to state 304

    '<'       [reduce using rule 168 (optional_generics_list)]
    $default  reduce using rule 168 (optional_generics_list)

    optional_generics_list  go to state 305
    generics_list           go to state 306


State 305

  303 new_expr: "new (T_NEW)" class_name_reference optional_generics_list . ctor_arguments

    '('  shift, and go to state 229

    $default  reduce using rule 405 (ctor_arguments)

    argument_list   go to state 546
    ctor_arguments  go to state 552


State 306

  169 optional_generics_list: generics_list .

    $default  reduce using rule 169 (optional_generics_list)
Bustup answered 21/1, 2017 at 22:47 Comment(7)
Do generics_list and argument_list look the same? Or, more precisely, is there some terminal which could be the start of either one?Fourscore
Alternatively, what do you mean by one empty list having "higher priority" than the other? If they are both empty, they are both empty, no?Fourscore
generics_list can start with '<' symbol, while ctor_arguments can start with '('. The problem lies in empty rule, because when I use generics_list instead of optional_generics_list and argument_list instead of ctor_arguments, then everything works fine, it seems.Bustup
Then you are not providing enough information to respond to your question. Is it the case that generics_list must start with <? Or can it be empty? (If it can be empty, it's obvious that the grammar is ambiguous.) Although I suspect the problem is with what can follow a new_expr: if the next token could be a (, for example, it won't be clear to the parser if that parenthesis belongs to an argument_list or not. (And then it has nothing to do with generics).Fourscore
Updated my question - attached some info from .output file, generated by bison. Also tried to replace in rule for new_expr optional_generics_list with generics_list and ctor_arguments with argument_list to get rid of empty rules, and in that case bison does not show any shift/reduce conflict. So I believe, that problem is in /* empty */ section of the rules.Bustup
OK; I did my best with the information available. I hope it helps. (Usually it helps to have a grammar which exhibits the problem, preferably reduced to a minimum (minimal reproducible example), but I'll trust my crystal ball this time.)Fourscore
Ok, thank You for an answerBustup
F
0

The problem indicated by the bison output file is that it is possible that a new_expr could be followed by an < which is not part of a generics_list. That would be the case if, for example, the grammar also contained productions like:

term: new_expr
comparison: term '<' term

(Of course, one would expect a real grammar to have a lot more possibilities than that, but those are the essential ones.)

In other words, if your grammar allows you to compare two newly-constructed objects, then the parser cannot tell whether the < that it sees is the start of a generics_list, or a simple comparison operator after a new_expr where both the generics_list and the ctor_arguments have been omitted:

if new Foo < oldFoo then...

myFoo = new Foo<int>(42)

The simplest fix would be to insist that a new_expr be parenthesized if it is going to be used in an expression.

For what it's worth, C++ handles this by knowing whether names are templated or not. If a name can take template arguments, then a < following the name is interpreted as starting the template argument list; otherwise it's a less-than operator. So if v is templated, then you have to write (new v) < ..., but if v is just a simple typename, you can leave out the parentheses: new int < .... implementing that is tricky; you need some kind of lexical feedback, and you need to impose some restrictions on where template declarations can be placed. C++ has some other unique parsing challenges with similar resolutions. For example, new int * i is an error because the * is parsed as being a pointer type modifier, using a rule which says that the type in a new expression is the longest sequence of tokens parseable as a type.

I'd go for mandatory parentheses when new is used as the left argument of any operator, since it's less confusing for the reader. It also simplifies the grammar, and that's a Good Thing because grammars are not just for parsing; they are an essential part of the documentation of the language, and unnecessarily complicated grammars make the language unnecessarily hard to learn and understand.

Interestingly, in the course of writing the above note about C++, I discovered a bug in the parser of one major C++ compiler. (At least, I think I know which compiler is buggy; it would be more accurate to say that I discovered that two popular compilers have inconsistent behaviour, so one of them must be wrong.) A simpler rule would have had no effect on any non-contrived program, and would have made the bug much less likely (and much easier to verify). So it is not at all clear to me the value added of permitting unparenthesized new operations in the middle of an expression.

Fourscore answered 21/1, 2017 at 23:18 Comment(1)
Verified it, and indeed problem was in that fact, that new_expr can be part of an comparison expression. Thanks a lot.Bustup

© 2022 - 2024 — McMap. All rights reserved.