Remove Ambiguity in abstract syntax in other to write DCG parser Prolog
Asked Answered
C

2

5

P => Program K => Block

S => Single-command

C => Commands

E => Expression

B => Boolean-expr

I => Identifier

N > Numeral

P ::= K.

K ::= begin C end

C ::= C1 ; C2 | S

S ::= I := E | if (B) then S | if (B) then S1 else S2 | while (B) do S | repeat C until (B) | K | print E

E ::= − E | E1 + E2 | E1 − E2 | E1 E2 | E1 div E2 | E1 mod E2 | (E) | I | N

B ::= E1 = E2 | E1 > E2 | E1 < E2 | E1 != E2 | not B | B1 and B2 | B1 or B2 | (B)

I am supposed to remove ambiguities in E and B so that I can write a DCG parser in prolog.

Celluloid answered 30/10, 2012 at 0:33 Comment(0)
W
5

Prolog evaluates top down, then LL grammar techniques can be adapted. DCG are more powerful than LL(1), but left recursion must still be eliminated.

B is easier to handle: left factor the production.

B ::= E Bx | not B | (B)
Bx ::= = E | > E | < E | != E | and B | or B

E is harder, because the miss mul token introduces still more ambiguity. Tentatively

E ::= − E | (E) | I | N | El
El ::= Ex E | epsilon
Ex ::= + El | − El | div El | mod El

epsilon (empty production) in DCG can be written this way

epsilon --> [].

If you need to handle precedence and associativity (in both B and E) you will need more work. You can refer to this older answer for a working schema.

Woodruff answered 30/10, 2012 at 8:37 Comment(0)
B
3

@chac already gave you quite a good answer, showing you the usual way to resolve this.

Let me take another way to read your question: You are "supposed to remove ambiguities in E and B so that" you "can write a DCG parser in Prolog". That means, you need to remove ambiguity only that far that you can write a DCG parser in Prolog. There is good news: You do not need to remove any ambiguities at all to write a DCG parser! Here is how:

The source of ambiguity are productions like

C ::= C ; C

or the other operators + - juxtapositioning div mod and

Let me stick to a simplified grammar:

E ::= E + E | "1"

We could encode this as

e --> "1".
e --> e, "+", e.

Unfortunately, Prolog does not terminate for a query like

?- L = "1+1+1", phrase(e,L).
   L = "1+1+1"
;  resource_error(_). % ERROR: Out of local stack

Actually, it terminates, but only because my computer's memory is finite...

Not even for:

?- L = "1", phrase(e,L).
   L = "1"
;  resource_error(_). % ERROR: Out of local stack

Is this a problem of ambiguity? No! It is just a procedural problem of Prolog which cannot handle left-recursions directly. Here is a way to make Prolog handle it:

e([_|S],S) --> "1".
e([_|S0],S) --> e(S0,S1), "+", e(S1,S).

?- L = "1+1+1", phrase(e(L,[]),L).
   L = "1+1+1"
;  L = "1+1+1"
;  false.
?- L = "1", phrase(e(L,[]),L).
   L = "1"
;  false.

For the moment we have only defined a grammar, most of the times you are also interested to see the corresponding syntax tree:

e(integer(1), [_|S],S) --> "1".
e(plus(L,R), [_|S0],S) --> e(L, S0,S1), "+", e(R, S1,S).

?- L = "1+1+1", phrase(e(Tree, L,[]),L).
   L = "1+1+1", Tree = plus(integer(1),plus(integer(1),integer(1)))
;  L = "1+1+1", Tree = plus(plus(integer(1),integer(1)),integer(1))
;  false.

Now, we see that there is an ambiguity with plus! Your original grammar both accepted it as (1+1)+1 and 1+(1+1) which itself is not a problem as long as the corresponding semantics guarantees that associativity is observed. Most of the time this is disambiguated to be left-associative, thus meaning (1+1)+1, but this is not the case for all infix operators.

Branch answered 30/10, 2012 at 16:44 Comment(2)
What a nice trick to bypass the left recursion limitation! It's not easy to understand...How much general is it?Woodruff
@chac: It is an additional difference over the length of the list. You can even optimize that a little bit further by moving to the left (I do that rarely, because it saves some runtime, but increases my thinking time significantly). As to its generality: That is fine, but it is not very realistic for larger input - it gets very costly. However, sometimes it is possible to enforce semidet mode for "relevant chunks".Branch

© 2022 - 2024 — McMap. All rights reserved.