BNF grammar definition for file path wildcard (glob)
I'm looking for a widely used glob dialect (like this one, https://github.com/vmeurisse/wildmatch, plus globstar **) described with BNF rules.

Any format or language will do. OMeta or PEG would be great.

Reliance answered 7/9, 2014 at 14:51 Comment(0)

I'm not sure I understand your question, since the grammar for file path wildcards can be reduced to a simple regular expression. The pattern syntax itself is defined by the Unix shell.

You can find the BNF for Bash here: http://my.safaribooksonline.com/book/operating-systems-and-server-administration/unix/1565923472/syntax/lbs.appd.div.3

In Python, the glob.glob() function is described in the standard library documentation. It uses fnmatch.fnmatch() to perform the pattern matching; see https://docs.python.org/2/library/fnmatch.html#fnmatch.fnmatch.
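As a quick illustration of the matching behaviour described above, here is a minimal sketch using only the standard fnmatch module. The file names are made up for the example. Note that fnmatch does not treat the path separator specially, which matters if you want a path-aware dialect with globstar.

```python
import fnmatch

# Hypothetical file names, used only for illustration.
names = ["main.py", "test_main.py", "README.md", "a.txt"]

# filter() keeps the names that match the pattern.
py_files = fnmatch.filter(names, "*.py")
single = fnmatch.filter(names, "?.txt")

# fnmatchcase() compares case-sensitively on every platform.
digit = fnmatch.fnmatchcase("data1.csv", "data[0-9].csv")
not_digit = fnmatch.fnmatchcase("dataX.csv", "data[!0-9].csv")

# '*' also matches '/', because fnmatch is not path-aware.
crosses_slash = fnmatch.fnmatchcase("src/pkg/mod.py", "*.py")
```

All four matching calls above succeed, including the last one, so a grammar for real path globbing has to add the separator rules itself.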

Internally, fnmatch.fnmatch() relies on fnmatch.translate() to convert a file path wildcard pattern into an ordinary regular expression, like this:

import re

def translate(pat):
    """Translate a shell PATTERN to a regular expression.

    There is no way to quote meta-characters.
    """

    i, n = 0, len(pat)
    res = ''
    while i < n:
        c = pat[i]
        i = i+1
        if c == '*':
            res = res + '.*'
        elif c == '?':
            res = res + '.'
        elif c == '[':
            j = i
            if j < n and pat[j] == '!':
                j = j+1
            if j < n and pat[j] == ']':
                j = j+1
            while j < n and pat[j] != ']':
                j = j+1
            if j >= n:
                res = res + '\\['
            else:
                stuff = pat[i:j].replace('\\','\\\\')
                i = j+1
                if stuff[0] == '!':
                    stuff = '^' + stuff[1:]
                elif stuff[0] == '^':
                    stuff = '\\' + stuff
                res = '%s[%s]' % (res, stuff)
        else:
            res = res + re.escape(c)
    # Global flags must come first in recent Python versions.
    return '(?ms)' + res + r'\Z'

That can help you write the BNF grammar...
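To see what the translation gives you in practice, here is a small sketch that compiles the output of the standard library's fnmatch.translate(). The helper name glob_to_regex and the sample file names are made up for the example.

```python
import fnmatch
import re


def glob_to_regex(pattern):
    """Compile a shell-style pattern via the standard library translator."""
    return re.compile(fnmatch.translate(pattern))


rx = glob_to_regex("report-[0-9]?.*")
candidates = ["report-12.pdf", "report-1.pdf", "report-x1.pdf"]
matches = [name for name in candidates if rx.match(name)]

# An unclosed '[' is not reported as an error: it is silently
# treated as a literal character.
unclosed_ok = glob_to_regex("a[b").match("a[b") is not None
```

Only "report-12.pdf" matches here. The last line shows the limitation of the regex approach: a malformed pattern is accepted rather than rejected, so there is nowhere to hang error reporting.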

EDIT

Here is a very simple grammar:

wildcard : expr
         | expr wildcard

expr : WORD
     | ASTERISK
     | QUESTION
     | neg_bracket_expr
     | pos_bracket_expr

pos_bracket_expr : LBRACKET WORD RBRACKET

neg_bracket_expr : LBRACKET EXCLAMATION WORD RBRACKET
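If it helps, the grammar above can be turned into a small hand-written recursive descent parser. This is only a sketch under some assumptions: the token definitions, the node names (literal, any_string, any_char, one_of, none_of) and the GlobSyntaxError class are all invented for the example, and WORD is taken to be any run of characters other than * ? [ ] and !. Because the grammar has no rule for them, a stray ! or ] outside a bracket expression is rejected.

```python
import re

# Assumed token definitions; the grammar above does not specify them.
TOKEN_RE = re.compile(r"""
    (?P<ASTERISK>\*)
  | (?P<QUESTION>\?)
  | (?P<LBRACKET>\[)
  | (?P<RBRACKET>\])
  | (?P<EXCLAMATION>!)
  | (?P<WORD>[^*?\[\]!]+)
""", re.VERBOSE)


class GlobSyntaxError(ValueError):
    """Raised when a pattern does not follow the grammar."""


def tokenize(pattern):
    # Every character falls into exactly one alternative, so
    # finditer() covers the whole input without gaps.
    return [(m.lastgroup, m.group()) for m in TOKEN_RE.finditer(pattern)]


def parse(pattern):
    """wildcard : expr | expr wildcard"""
    tokens = tokenize(pattern)
    if not tokens:
        raise GlobSyntaxError("empty pattern: at least one expr is required")
    nodes, pos = [], 0
    while pos < len(tokens):
        node, pos = parse_expr(tokens, pos)
        nodes.append(node)
    return nodes


def parse_expr(tokens, pos):
    """expr : WORD | ASTERISK | QUESTION | bracket expression"""
    kind, text = tokens[pos]
    if kind == "WORD":
        return ("literal", text), pos + 1
    if kind == "ASTERISK":
        return ("any_string",), pos + 1
    if kind == "QUESTION":
        return ("any_char",), pos + 1
    if kind == "LBRACKET":
        return parse_bracket(tokens, pos)
    raise GlobSyntaxError("unexpected %s %r at token %d" % (kind, text, pos))


def parse_bracket(tokens, pos):
    """LBRACKET [EXCLAMATION] WORD RBRACKET"""
    start = pos
    pos += 1  # consume LBRACKET
    negated = pos < len(tokens) and tokens[pos][0] == "EXCLAMATION"
    if negated:
        pos += 1
    if pos >= len(tokens) or tokens[pos][0] != "WORD":
        raise GlobSyntaxError("expected WORD in bracket at token %d" % start)
    chars = tokens[pos][1]
    pos += 1
    if pos >= len(tokens) or tokens[pos][0] != "RBRACKET":
        raise GlobSyntaxError("missing ']' for bracket at token %d" % start)
    return ("none_of" if negated else "one_of", chars), pos + 1


tree = parse("*.p[!y]?")
```

Unlike the regex translation, parse("a[bc") raises GlobSyntaxError instead of quietly treating the bracket as a literal, and the resulting node list is a place to start when extending the dialect.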

A list of popular grammars written for the ANTLR parser generator is available here: http://www.antlr3.org/grammar/list.html.

Kashmir answered 19/9, 2014 at 20:25 Comment(1)
"the grammar for file path wildcard can be reduced to a simple regular expression": well, actually, yes. You can write a regular expression that transforms a pattern into another regular expression that matches paths. But this solution lacks error handling for the pattern. I also need a full grammar implementation so that I can build my own dialect. - Reliance
