Python Source Code - Update Grammar

I am studying the Python source code and decided to try out some changes to the grammar, so I downloaded the source code for version 3.7.

I am following the guidelines of PEP 0306:
https://www.python.org/dev/peps/pep-0306/

and the example from Hackernoon:
https://hackernoon.com/modifying-the-python-language-in-7-minutes-b94b0a99ce14

The idea is to extend the decorator syntax (remember, this is just a study exercise; I already know there are other ways to do the same thing). Today a decorator can be applied like this:

@test
def mydef (self):
    pass

This works as expected, following this rule in the Grammar/Grammar file:

decorated: decorators (classdef | funcdef | async_funcdef)

Now the goal is to make decorators accept annotated declarations as well, starting with this example:

@test
id: int = 1

Analyzing the grammar, I found the annassign rule:

annassign: ':' test ['=' test]
# or even use small_stmt

Taking annassign to be the rule that matches id: int = 1, I changed the decorated rule to:

decorated: decorators (classdef | funcdef | async_funcdef | annassign)

Having done that (following PEP 0306), I went to ast.c, found the ast_for_decorated function, and located this piece of code:

[...]
assert(TYPE(CHILD(n, 1)) == funcdef ||
       TYPE(CHILD(n, 1)) == async_funcdef ||
       TYPE(CHILD(n, 1)) == classdef);

if (TYPE(CHILD(n, 1)) == funcdef) {
  thing = ast_for_funcdef(c, CHILD(n, 1), decorator_seq);
} else if (TYPE(CHILD(n, 1)) == classdef) {
  thing = ast_for_classdef(c, CHILD(n, 1), decorator_seq);
} else if (TYPE(CHILD(n, 1)) == async_funcdef) {
  thing = ast_for_async_funcdef(c, CHILD(n, 1), decorator_seq);
}
[...]

You can see that it checks the type of the next node (function, class, or async function) and then calls the corresponding ast_for_* function. So I made the following changes in ast.c:

[...]
assert(TYPE(CHILD(n, 1)) == funcdef ||
       TYPE(CHILD(n, 1)) == async_funcdef ||
       TYPE(CHILD(n, 1)) == annassign ||
       TYPE(CHILD(n, 1)) == classdef);

if (TYPE(CHILD(n, 1)) == funcdef) {
  thing = ast_for_funcdef(c, CHILD(n, 1), decorator_seq);
} else if (TYPE(CHILD(n, 1)) == annassign) {
  thing = ast_for_annassign(c, CHILD(n, 1));
} else if (TYPE(CHILD(n, 1)) == classdef) {
  thing = ast_for_classdef(c, CHILD(n, 1), decorator_seq);
} else if (TYPE(CHILD(n, 1)) == async_funcdef) {
  thing = ast_for_async_funcdef(c, CHILD(n, 1), decorator_seq);
}
[...]
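
A side note on the new branch: unlike the other three calls, ast_for_annassign is called without decorator_seq. The sketch below, run on a stock (unmodified) CPython, checks which AST node classes declare a decorator_list field. The helper name has_decorator_slot is hypothetical and only used for illustration. Assuming the AnnAssign definition in Python.asdl is left unchanged, the decorators would have nowhere to be stored even once the grammar accepts them.

```python
import ast

# Node classes that the stock `decorated` rule can produce.
decoratable = (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)


def has_decorator_slot(node_class):
    """Return True if the AST node class declares a decorator_list field.

    Hypothetical helper for illustration; not part of CPython.
    """
    return "decorator_list" in node_class._fields


# Compare the decoratable node classes with the annotated-assignment node.
for cls in decoratable + (ast.AnnAssign,):
    print(cls.__name__, has_decorator_slot(cls))
```

On a stock interpreter the three decoratable classes report a decorator_list field and AnnAssign does not.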

Note that I created the ast_for_annassign function, which contains the same validation code that ast_for_expr_stmt uses for annassign:

static stmt_ty 
ast_for_annassign(struct compiling *c, const node *n)
{
    REQ(n, expr_stmt);
    expr_ty expr1, expr2, expr3;
    node *ch = CHILD(n, 0);
    node *deep, *ann = CHILD(n, 1);
    int simple = 1;

    /* we keep track of parens to qualify (x) as expression not name */
    deep = ch;
    while (NCH(deep) == 1) {
        deep = CHILD(deep, 0);
    }
    if (NCH(deep) > 0 && TYPE(CHILD(deep, 0)) == LPAR) {
        simple = 0;
    }
    expr1 = ast_for_testlist(c, ch);
    if (!expr1) {
        return NULL;
    }
    switch (expr1->kind) {
        case Name_kind:
            if (forbidden_name(c, expr1->v.Name.id, n, 0)) {
                return NULL;
            }
            expr1->v.Name.ctx = Store;
            break;
        case Attribute_kind:
            if (forbidden_name(c, expr1->v.Attribute.attr, n, 1)) {
                return NULL;
            }
            expr1->v.Attribute.ctx = Store;
            break;
        case Subscript_kind:
            expr1->v.Subscript.ctx = Store;
            break;
        case List_kind:
            ast_error(c, ch,
                      "only single target (not list) can be annotated");
            return NULL;
        case Tuple_kind:
            ast_error(c, ch,
                      "only single target (not tuple) can be annotated");
            return NULL;
        default:
            ast_error(c, ch,
                      "illegal target for annotation");
            return NULL;
    }

    if (expr1->kind != Name_kind) {
        simple = 0;
    }
    ch = CHILD(ann, 1);
    expr2 = ast_for_expr(c, ch);
    if (!expr2) {
        return NULL;
    }
    if (NCH(ann) == 2) {
        return AnnAssign(expr1, expr2, NULL, simple,
                         LINENO(n), n->n_col_offset, c->c_arena);
    }
    else {
        ch = CHILD(ann, 3);
        expr3 = ast_for_expr(c, ch);
        if (!expr3) {
            return NULL;
        }
        return AnnAssign(expr1, expr2, expr3, simple,
                         LINENO(n), n->n_col_offset, c->c_arena);
    }
}

Then it was time to test (configure / make -j / make install). Running the file with python3.7 gives:

File "__init__.py", line 13
id: int = 1
^
SyntaxError: invalid syntax

With these changes to the grammar and ast.c, shouldn't the compiler accept these tokens as valid? Where am I going wrong?

Trina answered 17/8, 2018 at 19:18

id: int = 1 is not an annassign. The : int = 1 part is an annassign. (Even the line terminator doesn't count as part of the annassign.) There is no nonterminal in the Python grammar specifically for an annotated assignment statement; you may have to write one.
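
To make this concrete, here is a minimal sketch using the standard tokenize module on a stock (unmodified) interpreter. The helper name token_pairs is hypothetical. Since annassign is defined as ':' test ['=' test], it can only start matching at the ':' token; the NAME token before it belongs to the enclosing statement rule, so decorators followed directly by annassign never match id: int = 1.

```python
import io
import tokenize


def token_pairs(source):
    """Return (token type name, token string) pairs for a source string.

    Hypothetical helper for illustration; not part of CPython.
    """
    readline = io.StringIO(source).readline
    return [(tokenize.tok_name[tok.type], tok.string)
            for tok in tokenize.generate_tokens(readline)]


# The first token is the NAME 'id'; the ':' that annassign starts with comes second.
pairs = token_pairs("id: int = 1\n")
for pair in pairs:
    print(pair)

# A stock interpreter rejects a decorator in front of an annotated assignment.
try:
    compile("@test\nid: int = 1\n", "<example>", "exec")
    decorated_ok = True
except SyntaxError:
    decorated_ok = False
print("decorated annotated assignment accepted:", decorated_ok)
```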

Expressway answered 17/8, 2018 at 20:3
Comment (Trina): I understand your point. Basically, you are suggesting creating a new rule such as new: NAME ':' test ['=' test], similar to funcdef: 'def' NAME parameters ['->' test] ':' suite
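
One thing to keep in mind with a NAME-based rule: stock Python also allows attribute and subscript targets in annotated assignments, and records whether the target is a plain unparenthesized name in the simple flag. The sketch below uses the standard ast module on an unmodified interpreter to show the cases a rule starting with NAME would and would not cover. The helper name describe_annassign is hypothetical.

```python
import ast


def describe_annassign(source):
    """Parse one annotated assignment.

    Returns (target node type name, simple flag, whether a value is present).
    Hypothetical helper for illustration; not part of CPython.
    """
    node = ast.parse(source).body[0]
    if not isinstance(node, ast.AnnAssign):
        raise ValueError("not an annotated assignment: %r" % source)
    return type(node.target).__name__, node.simple, node.value is not None


samples = (
    "id: int = 1",        # plain name with a value
    "id: int",            # plain name, annotation only
    "obj.attr: int = 1",  # attribute target
    "d['k']: int = 1",    # subscript target
    "(id): int = 1",      # parenthesized name
)
for src in samples:
    print(src, describe_annassign(src))
```

Only the first two samples have a bare NAME as the target, so a rule of the form NAME ':' test ['=' test] would cover those and leave the other forms to the existing expr_stmt path.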
