How can you write a customizable grammar?

For a chat bot I'm writing, I want to make its parser customizable so people don't need to modify the bot itself to add hooks for whatever types of chat messages they want to handle. The parser uses a grammar. At the moment, I handle this with a class that looks something like this:

class Rule {
    has Regex:D $.matcher is required;
    has         &.parser  is required;

    method new(::?CLASS:_: Regex:D $matcher, &parser) {
        self.bless: :$matcher, :&parser
    }

    method match(::?CLASS:D: Str:D $target --> Replier:_) {
        $target ~~ $!matcher;
        $/.defined ?? &!parser(self, $/) !! Nil
    }
}

An array of these is then looped through from the parser's actions class. This lets people add their own "rules" to the parser, which solves my problem, but it is clunky and it reinvents grammars! What I really want is for people to be able to write something like a slang for my parser. augment could do this, but it isn't useful here: the user may want to change how they augment the parser at runtime, whereas augment is handled at compile time. How can this be done?

Underbid answered 20/12, 2019 at 13:38

All this takes is 5 or 10 lines of boilerplate, depending on whether or not you use an actions class.

If you take a look at Metamodel::GrammarHOW, as of writing, you'll find this:

class Perl6::Metamodel::GrammarHOW
    is Perl6::Metamodel::ClassHOW
    does Perl6::Metamodel::DefaultParent
{
}

Grammars are an extension of classes! This means it's possible to declare metamethods in them. Building on the question "How can classes be made parametric in Perl 6?", if the user provides roles for the grammar and actions class, they can be mixed in before parsing via parameterization. If you've written a slang before, this might sound familiar; mixing in roles like this is how $*LANG.refine_slang works!

If you want a token in a grammar to be augmentable, you would make it a proto token. All that would be needed afterwards is a parameterize metamethod that mixes in its argument, which would be a role of some kind:

grammar Foo::Grammar {
    token TOP { <foo> }

    proto token foo          {*}
          token foo:sym<foo> { <sym> }

    method ^parameterize(Foo::Grammar:U $this is raw, Mu $grammar-role is raw --> Foo::Grammar:U) {
        my Foo::Grammar:U $mixin := $this.^mixin: $grammar-role;
        $mixin.^set_name: $this.^name ~ '[' ~ $grammar-role.^name ~ ']';
        $mixin
    }
}

class Foo::Actions {
    method TOP($/) { make $<foo>.made; }

    method foo:sym<foo>($/) { make ~$<sym>; }

    method ^parameterize(Foo::Actions:U $this is raw, Mu $actions-role is raw --> Foo::Actions:U) {
        my Foo::Actions:U $mixin := $this.^mixin: $actions-role;
        $mixin.^set_name: $this.^name ~ '[' ~ $actions-role.^name ~ ']';
        $mixin
    }
}

Then the roles to mix in can be declared like so:

role Bar::Grammar {
    token foo:sym<bar> { <sym> }
}

role Bar::Actions {
    method foo:sym<bar>($/) { make ~$<sym>; }
}

Now Foo can be augmented with Bar before parsing if desired:

Foo::Grammar.subparse: 'foo', actions => Foo::Actions.new;
say $/ && $/.made; # OUTPUT: foo
Foo::Grammar.subparse: 'bar', actions => Foo::Actions.new;
say $/ && $/.made; # OUTPUT: #<failed match>

Foo::Grammar[Bar::Grammar].subparse: 'foo', actions => Foo::Actions[Bar::Actions].new;
say $/ && $/.made; # OUTPUT: foo
Foo::Grammar[Bar::Grammar].subparse: 'bar', actions => Foo::Actions[Bar::Actions].new;
say $/ && $/.made; # OUTPUT: bar
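
Since roles are ordinary values, the choice of which one to mix in can also be left until runtime, which is what the question asks for. The following is a minimal sketch under assumptions, not a definitive implementation: Cmd::Grammar, Echo, Quit, %plugins and grammar-for are hypothetical names, and the metamethod is invoked directly as .^parameterize instead of through the Foo[...] syntax.

```raku
# Hypothetical names throughout: Cmd::Grammar, Echo, Quit, %plugins, grammar-for.
grammar Cmd::Grammar {
    token TOP { <cmd> }

    proto token cmd           {*}
          token cmd:sym<ping> { <sym> }

    # Same shape as the parameterize metamethod in Foo::Grammar.
    method ^parameterize(Cmd::Grammar:U $this is raw, Mu $role is raw --> Cmd::Grammar:U) {
        my Cmd::Grammar:U $mixin := $this.^mixin: $role;
        $mixin.^set_name: $this.^name ~ '[' ~ $role.^name ~ ']';
        $mixin
    }
}

role Echo { token cmd:sym<echo> { <sym> } }
role Quit { token cmd:sym<quit> { <sym> } }

# Roles can be stored in a hash and looked up by name while the bot is running.
my %plugins = echo => Echo, quit => Quit;

sub grammar-for(Str:D $plugin) {
    # Role type objects are undefined, so test for the key with :exists, not //.
    die "unknown plugin: $plugin" unless %plugins{$plugin}:exists;
    Cmd::Grammar.^parameterize: %plugins{$plugin}
}

say so grammar-for('echo').parse('echo');
say so grammar-for('echo').parse('quit');
say so grammar-for('quit').parse('quit');
```

If this behaves as intended, the first and third calls should match and the second should not, because each call builds a grammar containing only the role that was looked up.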

Edit: the mixin metamethod can accept any number of roles as arguments, and parameterization can work with any signature. This means you can make parameterizations of grammars or actions classes accept any number of roles if you tweak the parameterize metamethod a bit:

method ^parameterize(Mu $this is raw, *@roles --> Mu) {
    my Mu $mixin := $this.^mixin: |@roles;
    $mixin.^set_name: $this.^name ~ '[' ~ @roles.map(*.^name).join(', ') ~ ']';
    $mixin
}
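
As a rough usage sketch of the variadic version, assuming it behaves as described: Shell::Grammar, Ls and Cd are hypothetical names introduced only for illustration, and the metamethod body is copied from the snippet above.

```raku
# Hypothetical names: Shell::Grammar, Ls, Cd.
grammar Shell::Grammar {
    token TOP { <cmd> }

    proto token cmd          {*}
          token cmd:sym<pwd> { <sym> }

    # The variadic parameterize metamethod, unchanged.
    method ^parameterize(Mu $this is raw, *@roles --> Mu) {
        my Mu $mixin := $this.^mixin: |@roles;
        $mixin.^set_name: $this.^name ~ '[' ~ @roles.map(*.^name).join(', ') ~ ']';
        $mixin
    }
}

role Ls { token cmd:sym<ls> { <sym> } }
role Cd { token cmd:sym<cd> { <sym> } }

# Both roles are mixed in by a single parameterization.
my $both = Shell::Grammar[Ls, Cd];

say so $both.parse('pwd');           # token from the base grammar
say so $both.parse('ls');            # token from Ls
say so $both.parse('cd');            # token from Cd
say so Shell::Grammar.parse('ls');   # the base grammar alone has no ls token
```

The first three parses should succeed and the last should fail, since the unparameterized grammar never sees the tokens from Ls or Cd.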

Underbid answered 20/12, 2019 at 13:38

This is a very cool solution. I probably would have had users instantiate and then do the mixin via does/but. The only thing I'd change here (maybe as a step 2 if you're writing a tutorial on it ;-) ) is to allow an iterable so multiple things can be mixed in simultaneously. Putter
