On lookbehind support
Major regex flavors differ in their support for lookbehind: some impose restrictions, and some do not support it at all.
- JavaScript: not supported before ES2018; supported without a length restriction since ES2018
- Python: fixed length only
- Java: finite length only
- .NET: no restriction
Python
In Python, where only fixed-length lookbehind is supported, your original pattern raises an error because \d{1,2} does not have a fixed length. You can work around this by alternating between two fixed-length lookbehinds, for example:
(?<=^\d\/)\d{1,2}|(?<=^\d\d\/)\d{1,2}
Or perhaps you can put both lookbehinds as alternates of a non-capturing group:
(?:(?<=^\d\/)|(?<=^\d\d\/))\d{1,2}
(Note that you can just use \d without the brackets.)
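As a side note on why the two lookbehinds are kept separate: Python's re module also rejects a single lookbehind whose alternatives differ in length, so folding both cases into one (?<=...|...) does not help. A minimal sketch to check this (the helper name compiles is made up for illustration):

```python
import re

def compiles(pattern):
    """Return True if re accepts the pattern, False if it raises re.error."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# One lookbehind with alternatives of different widths: rejected by re.
single = r'(?<=^\d/|^\d\d/)\d{1,2}'
# Two separate fixed-width lookbehinds, alternated outside: accepted.
split = r'(?:(?<=^\d/)|(?<=^\d\d/))\d{1,2}'

print(compiles(single))              # False
print(compiles(split))               # True
print(re.findall(split, "1/23/45"))  # ['23']
```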
That said, it's probably much simpler to use a capturing group instead:
^\d{1,2}\/(\d{1,2})
Note that findall returns what group 1 captures if the pattern has exactly one group. Capturing groups are more widely supported than lookbehind, and often lead to a more readable pattern (as in this case).
This snippet illustrates all of the above points:
import re

p = re.compile(r'(?:(?<=^\d\/)|(?<=^\d\d\/))\d{1,2}')
print(p.findall("12/34/56"))  # ['34']
print(p.findall("1/23/45"))   # ['23']

p = re.compile(r'^\d{1,2}\/(\d{1,2})')
print(p.findall("12/34/56"))  # ['34']
print(p.findall("1/23/45"))   # ['23']

p = re.compile(r'(?<=^\d{1,2}\/)\d{1,2}')
# raises re.error: look-behind requires fixed-width pattern
Java
Java supports lookbehind of any finite (bounded) length, so you can use \d{1,2} as in the original pattern. The following snippet demonstrates this:
String text =
    "12/34/56 date\n" +
    "1/23/45 another date\n";

Pattern p = Pattern.compile("(?m)(?<=^\\d{1,2}/)\\d{1,2}");
Matcher m = p.matcher(text);
while (m.find()) {
    System.out.println(m.group());
} // "34", "23"
Note that (?m) is the embedded form of Pattern.MULTILINE, so that ^ matches at the start of every line. Note also that since \ is an escape character in string literals, you must write "\\" to get one backslash in Java.
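For comparison, here is a rough Python sketch of the same multi-line extraction. Since Python rejects the variable-length lookbehind, it uses the capturing-group pattern instead; the variable names are only illustrative. It shows that the re.MULTILINE flag and the embedded (?m) behave the same, and what happens without either:

```python
import re

text = "12/34/56 date\n1/23/45 another date\n"

# Without MULTILINE, ^ only matches at the very start of the string.
no_flag = re.findall(r'^\d{1,2}/(\d{1,2})', text)

# With the flag (or the embedded (?m)), ^ matches at the start of every line.
with_flag = re.findall(r'^\d{1,2}/(\d{1,2})', text, re.MULTILINE)
embedded = re.findall(r'(?m)^\d{1,2}/(\d{1,2})', text)

print(no_flag)    # ['34']
print(with_flag)  # ['34', '23']
print(embedded)   # ['34', '23']
```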
C#
C# supports full regex syntax inside lookbehind. The following snippet shows how you can use + repetition in a lookbehind:
var text = @"
1/23/45
12/34/56
123/45/67
1234/56/78
";
Regex r = new Regex(@"(?m)(?<=^\d+/)\d{1,2}");
foreach (Match m in r.Matches(text)) {
    Console.WriteLine(m);
} // "23", "34", "45", "56"
Note that unlike Java, C# lets you use an @-quoted (verbatim) string so that you don't have to escape \.
For completeness, here's how you'd use the capturing group option in C#:
Regex r = new Regex(@"(?m)^\d+/(\d{1,2})");
foreach (Match m in r.Matches(text)) {
    Console.WriteLine("Matched [" + m + "]; month = " + m.Groups[1]);
}
Given the previous text, this prints:
Matched [1/23]; month = 23
Matched [12/34]; month = 34
Matched [123/45]; month = 45
Matched [1234/56]; month = 56
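The same capturing-group loop can be sketched in Python with finditer, where group(0) is the whole match and group(1) is the captured part. This is only an illustrative equivalent of the C# loop, with made-up variable names:

```python
import re

text = """
1/23/45
12/34/56
123/45/67
1234/56/78
"""

pattern = re.compile(r'(?m)^\d+/(\d{1,2})')

# Build one output line per match: the full match and the captured group.
lines = [
    f"Matched [{m.group(0)}]; month = {m.group(1)}"
    for m in pattern.finditer(text)
]
print("\n".join(lines))
```

Because the variable-length part (\d+) sits outside any lookbehind here, the pattern works unchanged in Python.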