How can I obtain a pointer to a Grammar token or regex?

This is similar to this question for classes, except the same procedure does not seem to work for Grammars.

grammar TestGrammar {
    token num { \d+ }
}


my $test-grammar = TestGrammar.new();
my $token = $test-grammar.^lookup('num');

say "3" ~~ $token;

This returns:

Type check failed in binding to parameter '<anon>'; expected TestGrammar but got Match (Match.new(:orig("3")...)
  in regex num at pointer-to-token.raku line 2
  in block <unit> at pointer-to-token.raku line 9

This suggests that the token has to be bound to a class/grammar object rather than called as a "bare" token. However, it's not clear how to do that. Passing the grammar or an instance of it as an argument produces a different error:

Cannot look up attributes in a TestGrammar type object. Did you forget a '.new'?

Any idea of why this does not really work?

Update: using ^find_method as indicated in this question that is referenced from the one above does not work either. Same issue. Using assuming does not fix it either.

Update 2: I seem to be getting somewhere here:

my $token = $test-grammar.^lookup('num').assuming($test-grammar);

say "33" ~~ $token;

This does not yield any error; however, it returns False no matter what.

Putsch answered 20/12, 2022 at 9:24 Comment(3)
What about just using the regex's name as the reference? For example, grammar foo { token bar { . } }; my $rule = 'bar'; say foo.parse(:$rule, 9); # 「9」? - Autotruck
@Autotruck I was going to answer it myself today in that direction, using subparse rather than parse, but close enough. Why don't you do it yourself so that I can accept it as an answer? Maybe you don't want to because it's not exactly what I was asking (getting the pointer still does not seem to be possible), but it would serve the same purpose. - Putsch
I've figured out what I think is a better answer. It doesn't bother with .parse or .subparse. I don't have time to properly write it up tonight. The very rough version I have right now is grammar foo { token bar { \d+ } }; my &match = { ($^grammar.^lookup: $^rule)($grammar.new: orig => $^text) }; say '42' ~~ match(foo, 'bar', $_);. But I want something much closer to your original question and to add explanation. - Autotruck

You're missing an argument at the end of your code:

grammar TestGrammar {
    token num { \d+ }
}

my $test-grammar = TestGrammar.new();
my $token = $test-grammar.^lookup('num');

say "3" ~~ $token($test-grammar.new: orig => $_);
                 ^^ -- the missing/new bit -- ^^

I'm confident you can more or less tuck that argument away -- but .assuming immediately reifies/evaluates the argument(s) being assumed so that won't work out. Instead we need to postpone that step until the smart match call (to get ahold of the $_ as it is during the smart match call).


We need to change the $token declaration and call. Here are two possibilities I can think of:

  • Stick with $token

    Change its declaration, and turn its use with ~~ into a method call:

    my $token = { $test-grammar.^lookup('num')($test-grammar.new: orig => $_) }
    
    say "3" ~~ .$token;
               ^ insert dot to make it a method call with `$_` as invocant
    
  • Switch to &token

    Now there's no need for the dot in the smart match line. Even better, you can drop the sigil:

    my &token = { $test-grammar.^lookup('num')($test-grammar.new: orig => $_) }
    
    say "3" ~~ .&token; # Same as:
    say "3" ~~  &token; # Same as:
    say "3" ~~   token;
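
If you need this for more than one rule, the same idea can be wrapped in a small helper. This is only a sketch: `rule-matcher` is a hypothetical name of my own, not something Raku provides, and it relies on the same `.new: orig => ...` call shown above.

```raku
grammar TestGrammar {
    token num { \d+ }
}

# Hypothetical helper (not part of Raku): returns a one-argument block
# that runs the named rule of the given grammar against its argument.
sub rule-matcher(Grammar:U $grammar, Str $rule) {
    my $regex = $grammar.^lookup($rule);
    # The grammar object is created per call, so each text gets a fresh one.
    return -> $text { $regex($grammar.new: orig => $text) };
}

my &num = rule-matcher(TestGrammar, 'num');

say so "3"   ~~ &num;   # smart match calls the block with "3"
say so "abc" ~~ &num;   # no leading digits, so the rule should fail
```

Note that, as far as I can tell, the rule is only anchored at the start of the text, so this behaves more like `.subparse` than `.parse`.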
    

A proper answer to your question should really provide a decent answer to these three questions:

  • Why does one have to pass a new grammar object?

  • What's this orig business?

  • How could anyone have known this?

I'm not going to answer those questions, at least not adequately/tonight, and perhaps never. (I recall investigating this years ago and getting bogged down in Rakudo code.)

From memory, my working hypothesis boils down to:

  • There's a fundamental aspect of the regex/grammar machinery wherein it presumes a match/grammar object is set up at the start (and then passed along to subrules as matching happens).

  • There's a difference between a method/rule declared with my vs with has regarding how that happens. (Presumably what self is bound to.)

  • That difference means user code has to deal with this disparity in the scenario covered by your question.
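
To make the first point a bit more concrete, here is a rough side-by-side sketch, assuming my hypothesis is right. It calls the rule by hand with a grammar object we set up ourselves, and then lets `.subparse` with `:rule` (the approach from the comments) do that setup instead. The variable names are just for illustration.

```raku
grammar TestGrammar {
    token num { \d+ }
}

# By hand: we construct the grammar object ourselves and give it the
# text to match via `orig`, then call the looked-up rule on it.
my $by-hand = TestGrammar.^lookup('num')(TestGrammar.new: orig => "33");

# By name: .subparse sets up the grammar object for us and then
# invokes the rule named by :rule.
my $by-name = TestGrammar.subparse("33", :rule<num>);

say $by-hand.Str;
say $by-name.Str;
```

If all you need is to pick a rule at run time, passing its name to `.subparse` or `.parse` is probably the simpler route; the by-hand call only matters when you want an actual callable to pass around.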

Autotruck answered 15/1, 2023 at 21:31 Comment(0)
