Is there a BNF with arguments for non-terminal symbols?
Asked Answered
D

1

4

In working with Prolog DCG to parse input it is nice to have an accompaning BNF of the grammar.

For example:

BNF

<Sentence> ::= <Noun_phrase> <Verb_phrase>
<Noun_phrase> ::= <Determiner> <Noun>
<Verb_phrase> ::= <Verb> <Phrase>
<Determiner> ::= a
<Determiner> ::= the
<Noun> ::= cat
<Noun> ::= mouse
<Verb> ::= scares
<Verb> ::= hates

as Prolog DCG

sentence --> noun_phrase, verb_phrase.
verb_phrase --> verb, noun_phrase.
noun_phrase --> determiner, noun.
determiner --> [a].
determiner --> [the].
noun --> [cat].
noun --> [mouse].
verb --> [scares].
verb --> [hates].

However Prolog DCG can also have arguments as
in this example Number for singular or plural

sentence(Number) --> noun_phrase(Number), verb_phrase(Number).
verb_phrase(Number) --> verb(Number), noun_phrase(Number).
noun_phrase(Number) --> determiner(Number), noun(Number).
determiner(singular) --> [a].
determiner(singular) --> [the].
determiner(plural) --> [the].
noun(singular) --> [cat].
noun(plural) --> [cats].
noun(singular) --> [mouse].
noun(plural) --> [mice].
verb(singular) --> [scares].
verb(plural) --> [scare].
verb(singular) --> [hates].
verb(plural) --> [hate].

Is there a standard or accepted extension to BNF that includes arguments for non-terminals?

If so I need a link to it.

I suspect that ATN (Augmented Transition Networks) are in the ball park and may be the only standard answer, but I am hoping for something that is linear text as opposed to some form vertex/edge graph.

Defilade answered 10/1, 2017 at 17:54 Comment(1)
Of interest: Introductory NLP course that includes FSA, RTN, etc. with Prolog codeDefilade
M
2

I think the concept of feature structures is what you're looking for; the sharing of arguments you show in your example is a special case of the more general feature structure unification approach.

I'm not aware of feature structure extensions for BNF specifically, but there are reasonably accepted notations for adding them to other grammar formalisms. The documentation for NLTK (a Python natural language processing library) has an example here of the notation they use. Here are some of their rules that correspond roughly to the first few productions from your example:

S -> NP[CASE=nom, AGR=?a] VP[AGR=?a]
VP[AGR=?a] -> TV[OBJCASE=?c, AGR=?a] NP[CASE=?c]
NP[CASE=?c, AGR=?a] -> Det[CASE=?c, AGR=?a] N[CASE=?c, AGR=?a]

The ?x is their notation for logic variables. That entire chapter of the NLTK manual contains a general desciption of feature structures and includes some literature references.

Marshmallow answered 11/1, 2017 at 9:42 Comment(5)
I find it interesting in that the more I look for standard ways to do this the more I come to believe that Prolog DCG might be the better way to represent these concepts in linear text. I do think the edge/vertex graphs are better, but when documenting in a text only manner such as with source code, diagrams don't work, thus the need. Your answer is helping me learn.Defilade
Of interest: Introduction to NLP - Lecture 16: Feature structures and unification and syllabus by Julia HockenmaierDefilade
Of interest: Feature Structure Unification Grammars - FSUGs are very compact and abstract in comparison to DCGs and the difference is even more marked when compared to CFGs.Defilade
Of interest: Feature Structures & Unification - Has nice examples of text representations and of unification trace. Search for Unification trace:Defilade
For now I will be sticking with just the Prolog DCG. In researching the details I learned a few things. :)Defilade

© 2022 - 2024 — McMap. All rights reserved.