Raku regex: How to use capturing group inside lookbehinds

How can I use capturing groups inside lookbehind assertions?

I tried to use the same formula as in this answer. But that does not seem to work with lookbehinds.

Conceptually, this is what I was trying to do.

say "133" ~~ m/ <?after $0+> (\d) $ /

I know this can be easily achieved without lookbehinds, but ignore that just for now :)

For this I tried with these options:

Use :var syntax:

say "133" ~~ m/ <?after $look-behind+> (\d):my $look-behind; $ /;
# Variable '$look-behind' is not declared

Use code block syntax defining the variable outside:

my $look-behind;
say "133" ~~ m/ <?after $look-behind+> (\d) {$look-behind=$0} $ /;
# False

It seems that the problem is that the lookbehind is executed before the "code block/:my $var", and thus the variable is empty for the lookbehind tree.

Is there a way to use capturing groups inside lookbehinds?

Pastose answered 24/12, 2020 at 11:19 Comment(6)
A, er, curiosity: say .<after> given foo ~~ m/ $ <after bar> / will work for any string in foo that ends with the pattern bar, which is fair enough... and displays a capture that's the string in foo in its entirety (not just bar) and entirely flipped (backwards)! 🤪Heinrick
"I know this can be easily achieved without lookbehinds, but ignore that just for now :)" I've ignored that for a day, but now's a new now. :) So, what is it that you're really trying to do?Heinrick
Regexes in Raku are code. You wouldn't expect a variable to have data in it before you set it, so why would you expect the same from a Regex?Kile
Hi @Heinrick I tried to solve Day 15 of Advent of Code using regexes (and inline code in them). I finally reversed the input numbers and used lookaheads instead of lookbehinds. But I figured there would be a way to reference capturing groups in lookbehinds in Raku, so I asked... :)Pastose
@BradGilbert Well, I was not expecting anything xD, that was something I tried just in case...Pastose
@Pastose Thanks. I figured out what you were asking after I posted my comments. In case you haven't yet figured this out: using jnthn's answer to your previous Q about lookahead (/ (a) :my $lookahead; <?before b {$lookahead = $/}> /), a simple translation of that to a lookbehind version solving this Q would be / (a) :my $lookbehind; { $lookbehind = $/ } <?after $lookbehind ** 2> $ /;. (Which is more or less the same as @WiktorStribiżew's answer.)Heinrick

When you reference a captured value before it is actually captured, it is not initialized, so you can't get a match. You need to place the capturing group before the point where you use the backreference to the captured value.

Next, you need a code block between the capture and the point where you use it: the block publishes the capture, so that it can be assigned to a variable that stays visible for the rest of the regex pattern, including the lookbehind. See this Capturing Raku reference:

This code block publishes the capture inside the regex, so that it can be assigned to other variables or used for subsequent matches

You can use something like

say "133" ~~ m/ (\d) {} :my $c=$0; <?after $c ** 2> $ /;

Here, (\d) matches and captures a digit, the empty code block {} publishes that capture so that $0 is up to date, and :my $c=$0; declares a variable $c holding the captured value. The <?after $c ** 2> lookbehind then checks that the $c value appears twice in a row immediately to the left of the current location (the digit just captured counts as one of the two, since the lookbehind runs after it), and the $ anchor checks that the current position is the end of the string.
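
To try the pattern on more than one input, it can be wrapped in a small helper. This is only a sketch: the sub name ends-with-doubled-digit and the sample strings are made up for illustration and are not part of the original question. The regex itself is the one shown above, just spread over several lines with comments.

```raku
# Hypothetical helper (name chosen for illustration): True when the string
# ends with the same digit twice in a row.
sub ends-with-doubled-digit(Str $s --> Bool) {
    so $s ~~ /
        (\d)                # capture a digit as $0
        {}                  # empty block: publishes the capture
        :my $c = $0;        # copy the capture into a regex-scoped variable
        <?after $c ** 2>    # the two characters to the left must both be $c
        $                   # end of string
    /;
}

# Sample inputs are assumptions picked to exercise both outcomes.
for "133", "123", "7" -> $s {
    say "$s: ", ends-with-doubled-digit($s);
}
```

Keeping the {} and the :my declaration before the lookbehind is the important part; moving either one after <?after ...> brings back the original problem of reading the variable before it is set.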

See this online Raku demo.

Mandler answered 24/12, 2020 at 12:56 Comment(2)
Superb! Nice trick to postpone the lookbehind and read again the captured group on it!Pastose
It would be worth pointing out that the reason {} is needed is so that $/ gets updated. ( $0 is really a shortcut for $/[0])Kile
