Why does this simple grammar have a shift/reduce conflict?
Asked Answered
D

4

5
%token <token> PLUS MINUS INT
%left PLUS MINUS

THIS WORKS:

exp : exp PLUS exp;
exp : exp MINUS exp;
exp : INT;

THIS HAS 2 SHIFT/REDUCE CONFLICTS:

exp : exp binaryop exp;
exp : INT;
binaryop: PLUS | MINUS ;

WHY?

Discriminant answered 15/3, 2012 at 9:23 Comment(0)
G
5

This is because the second is in fact ambiguous. So is the first grammar, but you resolved the ambiguity by adding %left.

This %left does not work in the second grammar, because associativity and precedence are not inherited from rule to rule. I.e. the binaryop nonterminal does not inherit any such thing even though it produces PLUS and MINUS. Associativity and predecence are localized to a rule, and revolve around terminal symbols.

We cannot do %left binaryop, but we can slightly refactor the grammar:

exp : exp binaryop term
exp : term;
term : INT;
binaryop: PLUS | MINUS ;

That has no conflicts now because it is implicitly left-associative. I.e. the production of a longer and longer expression can only happen on the left side of the binaryop, because the right side is a term which produces only an INT.

Golda answered 15/3, 2012 at 20:14 Comment(0)
E
4

You need to specify a precedence for the exp binop exp rule if you want the precedence rules to resolve the ambiguity:

exp : exp binaryop exp %prec PLUS;

With that change, all the conflicts are resolved.

Edit

The comments seem to indicate some confusion as to what the precedence rules in yacc/bison do.

The precedence rules are a way of semi-automatically resolving shift/reduce conflicts in the grammar. They're only semi-automatic in that you have to know what you are doing when you specify the precedences.

Bascially, whenever there is a shift/reduce conflict between a token to be shifted and a rule to be reduced, yacc compares the precedence of the token to be shifted and the rule to be reduced, and -- as long as both have assigned precedences -- does whichever is higher precedence. If either the token or the rule has no precedence assigned, then the conflict is reported to the user.

%left/%right/%nonassoc come into the picture when the token and rule have the SAME precedence. In that case %left means do the reduce, %right means do the shift, and %nonassoc means do neither, causing a syntax error at runtime if the parser runs into this case.

The precedence levels themselves are assigned to tokens with%left/%right/%nonassoc and to rules with %prec. The only oddness is that rules with no %prec and at least one terminal on the RHS get the precedence of the last terminal on the RHS. This can sometimes end up assigning precedences to rules that you really don't want to have precedence, which can sometimes result in hiding conflicts due to resolving them incorrectly. You can avoid these problems by adding an extra level of indirection in the rule in question -- change the problematic terminal on the RHS to to a new non-terminal that expands to just that terminal.

Echinoid answered 16/3, 2012 at 0:13 Comment(4)
If you take out the %left declaration and introduce only %nonassoc LOW and %nonassoc HIGH phony tokens, and then assign precedences to all rules of the grammar using either %prec LOW or %prec HIGH, you will not eliminate the s/r conflict no matter what combination of LOW and HIGH you use.Golda
I.e. the reason that the conflict goes away is not what it appears to be. If you have %left PLUS MINUS, then all kinds of %prec declarations suppress the conflict. Some of them appear to make the grammar right associative, by flipping between shifting or reducing in State 7.Golda
P.S. I have used a %prec hack in rules that have an empty production, like foo_list : foo_list foo %prec HIGH | /* empty */;. Rationale: there is no operator between foo_list and foo to which we can assign a precedence and the hack does the trick.Golda
The documentation for %prec: gnu.org/software/bison/manual/html_node/…Iffy
H
0

I assume that this falls under what the Bison manual calls "Mysterious Conflicts". You can replicate that with:

exp:   exp plus exp;
exp:   exp minus exp;
exp:   INT;
plus:  PLUS;
minus: MINUS;

which gives four S/R conflicts for me.

Harless answered 15/3, 2012 at 19:17 Comment(2)
No, those are reduce-reduce conflicts.Discriminant
Not to Bison: test2.y: conflicts: 4 shift/reduce.Harless
A
0

The output file describing the conflicted grammar produced by Bison (version 2.3) on Linux is as follows. The key information at the top is 'State 7 has conflicts'.

State 7 conflicts: 2 shift/reduce

Grammar

    0 $accept: exp $end

    1 exp: exp binaryop exp
    2    | INT

    3 binaryop: PLUS
    4         | MINUS

Terminals, with rules where they appear

$end (0) 0
error (256)
PLUS (258) 3
MINUS (259) 4
INT (260) 2

Nonterminals, with rules where they appear

$accept (6)
    on left: 0
exp (7)
    on left: 1 2, on right: 0 1
binaryop (8)
    on left: 3 4, on right: 1

state 0

    0 $accept: . exp $end

    INT  shift, and go to state 1

    exp  go to state 2

state 1

    2 exp: INT .

    $default  reduce using rule 2 (exp)

state 2

    0 $accept: exp . $end
    1 exp: exp . binaryop exp

    $end   shift, and go to state 3
    PLUS   shift, and go to state 4
    MINUS  shift, and go to state 5

    binaryop  go to state 6

state 3

    0 $accept: exp $end .

    $default  accept

state 4

    3 binaryop: PLUS .

    $default  reduce using rule 3 (binaryop)

state 5

    4 binaryop: MINUS .

    $default  reduce using rule 4 (binaryop)

state 6

    1 exp: exp binaryop . exp

    INT  shift, and go to state 1

    exp  go to state 7  

And here is the information about 'State 7':

state 7

    1 exp: exp . binaryop exp
    1    | exp binaryop exp .

    PLUS   shift, and go to state 4
    MINUS  shift, and go to state 5

    PLUS      [reduce using rule 1 (exp)]
    MINUS     [reduce using rule 1 (exp)]
    $default  reduce using rule 1 (exp)

    binaryop  go to state 6

The trouble is described by the . markers in the the lines marked 1. For some reason, the %left is not 'taking effect' as you'd expect, so Bison identifies a conflict when it has read exp PLUS exp and finds a PLUS or MINUS after it. In such cases, Bison (and Yacc) do the shift rather than the reduce. In this context, that seems to me to be tantamount to giving the rules right precedence.

Changing the %left to %right and omitting it do not change the result (in terms of the conflict warnings). I also tried Yacc on Solaris and it produce essentially the same conflict.

So, why does the first grammar work? Here's the output:

Grammar

    0 $accept: exp $end

    1 exp: exp PLUS exp
    2    | exp MINUS exp
    3    | INT

Terminals, with rules where they appear

$end (0) 0
error (256)
PLUS (258) 1
MINUS (259) 2
INT (260) 3

Nonterminals, with rules where they appear

$accept (6)
    on left: 0
exp (7)
    on left: 1 2 3, on right: 0 1 2

state 0

    0 $accept: . exp $end

    INT  shift, and go to state 1

    exp  go to state 2

state 1

    3 exp: INT .

    $default  reduce using rule 3 (exp)

state 2

    0 $accept: exp . $end
    1 exp: exp . PLUS exp
    2    | exp . MINUS exp

    $end   shift, and go to state 3
    PLUS   shift, and go to state 4
    MINUS  shift, and go to state 5

state 3

    0 $accept: exp $end .

    $default  accept

state 4

    1 exp: exp PLUS . exp

    INT  shift, and go to state 1

    exp  go to state 6

state 5

    2 exp: exp MINUS . exp

    INT  shift, and go to state 1

    exp  go to state 7

state 6

    1 exp: exp . PLUS exp
    1    | exp PLUS exp .
    2    | exp . MINUS exp

    $default  reduce using rule 1 (exp)

state 7

    1 exp: exp . PLUS exp
    2    | exp . MINUS exp
    2    | exp MINUS exp .

    $default  reduce using rule 2 (exp)

The difference seems to be that in states 6 and 7, it is able to distinguish what to do based on what comes next.

One way of fixing the problem is:

%token <token> PLUS MINUS INT
%left PLUS MINUS

%%

exp  : exp binaryop term;
exp  : term;
term : INT;
binaryop: PLUS | MINUS;
Aleutian answered 15/3, 2012 at 21:5 Comment(3)
The reason %left isn't taking effect is that its effect takes place only in rules where those tokens appear. PLUS and MINUS appear in binaryop. If there was a left/right ambiguity in binaryop, then this associativity would "kick in" to fix it. Once the PLUS and MINUS are reduced to binaryop, that symbol no longer has anything to do with PLUS and MINUS and their associativity. What yacc could benefit from would be an extension to allow nonterminals to be given associativity, so you could write %left binaryop.Golda
Alternatively, yacc could support some grammar expansion at parse time, such as (PLUS|MINUS). Or even macro-like rules which are unfolded at grammar parse time.Golda
@Kaz: that makes sense (precedence only working when there are alternatives in a single rule), but I don't recall seeing it called out before.Aleutian

© 2022 - 2024 — McMap. All rights reserved.