Prolog : Combining DCG grammars with other restrictions
Asked Answered
B

2

9

I'm very impressed by Prolog's DCG and how quickly I can produce all the possible structures that fit a particular grammar.

But I'd like to combine this search with other constraints. For example, define a complex grammar and ask Prolog to generate all sentences with not more than 10 words. Or all sentences that don't repeat the same word twice.

Is it possible to add extra constraints like this to a DCG grammer? Or do I basically have to translate the DCG back into normal Prolog clauses and start modifying them?

Billingsley answered 29/6, 2011 at 17:20 Comment(0)
B
9

If you only want to see all sentences that are generated, it is very convenient to use the following:

?- length(Xs, N), phrase(mynonterminal, Xs).

Of course that generates all sentences. But it is very useful and it saves you the time to think of a concrete limit. If you want to restrict that further, add the goal between(0,10,N) in front.

If you want to say within a grammar, that a certain non-terminal should take a certain length, it is best to say this explicitly:

seq([]) --> [].
seq([E|Es]) --> [E], seq(Es).

a --> {length(Es,10)}, seq(Es), {phrase(mynonterminal,Es)}.

If you are still not happy, then you want to express the intersection of two non-terminals. This is tantamount to asking the intersection of two context free languages which is in the general case undecidable. But much earlier, you will have problems with termination. So be aware of that in what follows:

:- op( 950, xfx, &).

(NT1 & NT2) -->
     call(Xs0^Xs^(phrase(NT1,Xs0,Xs),phrase(NT2,Xs0,Xs))).

The following is only needed if you do not use library(lambda):

^(V0, Goal, V0, V) :-
      call(Goal,V).

^(V, Goal, V) :-
     call(Goal).

So this permits you now to express the intersection of two non-terminals. But please, be aware that termination is very brittle here. In particular, the termination of the first non-terminal does not necessarily limit the second.

Breger answered 29/6, 2011 at 17:55 Comment(7)
I think the earlier part of this, dealing with "seq", is what I need (ie. one nonterminal is a finite list). But I'm not quite getting it to work perhaps because I don't understand. What is "phrase" in that first example?Billingsley
Prior to understanding the definition of (&)//2, try to understand the way how DCGs are encoded in Prolog. A good book on this is Prolog and Natural Language Analysis by Pereira and Shieber. mtome.com/Publications/PNLA/pnla.html (it's free)Breger
Hi false, thanks for the book reference. Will be useful. On my particular problem, I think your solution "a --> {length(Es,10)}, seq(Es), {phrase(mynonterminal,Es)}." looks almost exactly what I want. I just don't understand what, in my program, I should be writing where you've written "phrase". I can see the purpose of that part is to say that the sequence is made of mynonterminals. But what is that "phrase" predicate itself?Billingsley
ah. NO OK. Sorry. I see it's a built-in keyword.Billingsley
phrase/2 is a clean interface for predicates to call non-terminals. You could also call them directly, but then it is not evident if you call a predicate or non-terminal. So strictly speaking it is not needed, but much clearer to read.Breger
Do all Prologs translate a DCG call/n to call/n+2 with Input List and Output List added? Or is there another way to access the Input List and the Output List in a free manner? {}/1 will not do. But a definition: now(I^O^A,I,O) :- A. And then (A & B) --> now(I^O^(phrase(A,I,O),phrase(B,I,O)) would be more efficient I guess. The now/3 can be viewed as a generalized {}/1.Lowdown
All Prologs that translate p//n (note the double //) to p/n+2 will translate call//n to call/n+2. In the current ISO drafts for DCGs however, only call//1 is mentioned - since that is the crucial case.Breger
T
6

well, you can always use {} and write any kind of prolog predicate in-between, for example:

foo(X)-->
    { valid(X) },
    [a].
foo(X)-->
    [b].

so you could add some sort of word counter. of course, if each token is a word you could simply write something like: length(L,N), N<11, start(L,[]).

on the other hand, perhaps it will be better, depending on the complexity of the constrains, to encode them in a different part. something like parser->semantic checker in compilers.

Torras answered 29/6, 2011 at 17:56 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.