Scala regex Named Capturing Groups
In the MatchData trait of scala.util.matching.Regex I see that there is support for group names, which I thought was related to regex named capturing groups.

But as I understand it (ref), Java does not support group names until version 7, so Scala version 2.8.0 (Java HotSpot(TM) 64-Bit Server VM, Java 1.6. gives me this exception:

scala> val pattern = """(?<login>\w+) (?<id>\d+)""".r
java.util.regex.PatternSyntaxException: Look-behind group does not have an obvious maximum length near index 11
(?<login>\w+) (?<id>\d+)
           ^
        at java.util.regex.Pattern.error(Pattern.java:1713)
        at java.util.regex.Pattern.group0(Pattern.java:2488)
        at java.util.regex.Pattern.sequence(Pattern.java:1806)
        at java.util.regex.Pattern.expr(Pattern.java:1752)
        at java.util.regex.Pattern.compile(Pattern.java:1460)

So the question is: are named capturing groups supported in Scala? If so, are there any examples out there?

Mccourt answered 12/6, 2010 at 18:17 Comment(0)
H
35

I'm afraid that Scala's named groups aren't defined the same way. They are nothing but post-processing aliases for the unnamed (i.e. just numbered) groups in the original pattern.

Here's an example:

import scala.util.matching.Regex

object Main {
   def main(args: Array[String]): Unit = {
      // The names after the pattern are aliases for groups 1 and 2, in order.
      val pattern = new Regex("""(\w*) (\w*)""", "firstName", "lastName")
      val result = pattern.findFirstMatchIn("James Bond").get
      println(result.group("lastName") + ", " + result.group("firstName"))
   }
}

This prints (as seen on ideone.com):

Bond, James

What happens here is that in the constructor for the Regex, we provide the aliases for group 1, 2, etc. Then we can refer to these groups by those names. These names are not intrinsic in the patterns themselves.
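As a minimal sketch of that aliasing (the object and method names below are made up for illustration): looking a group up by its alias gives the same result as looking it up by its index, and the same groups can also be bound to local names with the Regex extractor, which requires the whole input to match.

```scala
import scala.util.matching.Regex

object AliasDemo {
  // Same pattern and aliases as in the example above.
  val pattern = new Regex("""(\w*) (\w*)""", "firstName", "lastName")

  // Returns the second group twice: once by alias, once by index.
  def lastNameBothWays(input: String): Option[(String, String)] =
    pattern.findFirstMatchIn(input).map(m => (m.group("lastName"), m.group(2)))

  // Binds the groups to local names via the extractor (full match required).
  def swap(input: String): Option[String] = input match {
    case pattern(first, last) => Some(last + ", " + first)
    case _                    => None
  }

  def main(args: Array[String]): Unit = {
    println(lastNameBothWays("James Bond"))
    println(swap("James Bond"))
  }
}
```

Note that newer Scala releases may emit a deprecation warning for the constructor that takes group names.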

Hewlett answered 12/6, 2010 at 18:40 Comment(4)
Thanks. There is no overloaded RichString.r for that. - Mccourt
Is this still the behavior in Scala 2.11 with Java 7? - Ferromagnetic
@javadba It still is. Now you can also do val pattern = """(\w*) (\w*)""".r("firstName", "lastName"); - Jacklight
@Jacklight Yes, I had verified what you said. C'est la vie. - Ferromagnetic
P
3

Scala does not have its own implementation of regular expression matching. Instead, the underlying regular expressions are Java's, so the details of writing patterns are those documented in java.util.regex.Pattern.

There you will find that the syntax you're using is actually that of the look-behind constraint, though according to the docs the < must be followed by either = (positive look-behind) or ! (negative look-behind).
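For comparison, here is a small sketch of the two look-behind forms that the Pattern docs do describe. The patterns, inputs, and object name are invented for illustration only.

```scala
object LookBehindDemo {
  // Positive look-behind: digits only when immediately preceded by "id=".
  val afterId = """(?<=id=)\d+""".r

  // Negative look-behind: digits not preceded by '#' or by another digit.
  val notAfterHash = """(?<![#\d])\d+""".r

  def main(args: Array[String]): Unit = {
    println(afterId.findFirstIn("user id=42"))
    println(notAfterHash.findAllIn("#12 34").toList)
  }
}
```

In both cases the text inside the look-behind is checked but is not part of the returned match.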

Plattdeutsch answered 12/6, 2010 at 18:37 Comment(2)
This is something that should have been flagged as a bug a long time ago. Although it throws an exception when it's supposed to, the message should be "Unknown look-behind group" (as was obviously intended) or even better "Unknown group type". It is, of course, fixed in JDK 1.7 as a side effect of adding named groups. - Stickler
This is no longer true starting from JDK 7, in which group names are supported in the JDK but not in Scala. - Ferromagnetic
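
To illustrate the last comment, here is a hedged sketch that assumes a Java 7 or later runtime: the pattern from the question then compiles, and the named groups can be read through the underlying java.util.regex.Matcher rather than through Scala's alias mechanism. The object and method names are hypothetical.

```scala
object NamedGroupDemo {
  // The pattern from the question; assumes Java 7+, where (?<name>...) is valid.
  val regex = """(?<login>\w+) (?<id>\d+)""".r

  // Looks the groups up by name on the Java Matcher obtained from Regex.pattern.
  def parse(input: String): Option[(String, String)] = {
    val m = regex.pattern.matcher(input)
    if (m.matches()) Some((m.group("login"), m.group("id"))) else None
  }

  def main(args: Array[String]): Unit = {
    println(parse("alice 42"))
  }
}
```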