Java regex error - Look-behind group does not have an obvious maximum length
A

3

20

I get this error:

java.util.regex.PatternSyntaxException: Look-behind group does not have an
    obvious maximum length near index 22
([a-z])(?!.*\1)(?<!\1.+)([a-z])(?!.*\2)(?<!\2.+)(.)(\3)(.)(\5)
                      ^

I'm trying to match COFFEE, but not BOBBEE.

I'm using Java 1.6.
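
To check how a particular JDK treats a pattern like this, a small helper can report the compile error instead of letting the exception propagate. This is only a sketch: the names `CompileCheck` and `compileError` are made up for illustration, and since the outcome for the pattern above may depend on the Java version, no particular output is claimed.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical helper, not part of the original question.
class CompileCheck {
    // Returns null if the regex compiles, otherwise a short error description.
    static String compileError(String regex) {
        try {
            Pattern.compile(regex);
            return null;
        } catch (PatternSyntaxException e) {
            return e.getDescription() + " (index " + e.getIndex() + ")";
        }
    }

    public static void main(String[] args) {
        // The pattern from the question, with backslashes escaped for Java.
        String regex =
            "([a-z])(?!.*\\1)(?<!\\1.+)([a-z])(?!.*\\2)(?<!\\2.+)(.)(\\3)(.)(\\5)";
        String error = compileError(regex);
        System.out.println(error == null ? "compiles" : "rejected: " + error);
    }
}
```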

Apograph answered 25/9, 2011 at 4:59 Comment(0)
R
13

Java doesn't support variable-length lookbehind.
In this case, it seems you can simply drop the lookbehinds (assuming your entire input is one word):

([a-z])(?!.*\1)([a-z])(?!.*\2)(.)(\3)(.)(\5)

Neither lookbehind adds anything: the first can only match when at least two characters precede it, and at that point you have matched just one, and the second checks that the second character is different from the first, which (?!.*\1) already covers.
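
A minimal sketch of how the lookahead-only pattern could be tried from Java. The class name `LookaheadOnly` and the `CASE_INSENSITIVE` flag are assumptions added here so that `[a-z]` accepts the upper-case examples from the question; lower-casing the input first would work just as well.

```java
import java.util.regex.Pattern;

class LookaheadOnly {
    // Pattern from this answer. CASE_INSENSITIVE is an added assumption so
    // that [a-z] and the backreferences also accept upper-case input.
    static final Pattern WORD = Pattern.compile(
            "([a-z])(?!.*\\1)([a-z])(?!.*\\2)(.)(\\3)(.)(\\5)",
            Pattern.CASE_INSENSITIVE);

    // matches() anchors the pattern to the whole input, which corresponds to
    // the "entire input is one word" assumption above.
    static boolean accepts(String word) {
        return WORD.matcher(word).matches();
    }

    public static void main(String[] args) {
        System.out.println("COFFEE: " + accepts("COFFEE"));
        System.out.println("BOBBEE: " + accepts("BOBBEE"));
    }
}
```

With these inputs, `accepts("COFFEE")` is true and `accepts("BOBBEE")` is false, because the first B reappears later in the word and the lookahead `(?!.*\1)` rejects it.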

Working example: http://regexr.com?2up96

Rapids answered 25/9, 2011 at 5:12 Comment(1)
It is probably not correct to say that Java does not support "variable length" in the look behind regex. For example, the following works: (?<=(a*))afs - Urinary
C
20

To avoid this error, you should replace + with a bounded range like {0,10}:

([a-z])(?!.*\1)(?<!\1.{0,10})([a-z])(?!.*\2)(?<!\2.{0,10})(.)(\3)(.)(\5)
Cousins answered 13/12, 2011 at 3:16 Comment(3)
I was using (?<=(OF[ ]{0,5})|(MIRS[ ]{0,5}))[0-9]+, originally with * where {0,5} is now, and it gave me the same error: Look-behind group does not have an obvious maximum length near index 22. After reading your answer I changed the * to {0,5} and it worked perfectly. I wonder why Java doesn't support this. - Stannum
Overall, this is kind of a dirty workaround, but yeah, if the piece is guaranteed not to be too long, substituting a + with something like {1,999} works fine. - Modulation
Do note that if you're replacing a +, you should start from a minimum of 1, not 0. - Modulation
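
The point about the minimum can be seen in a small sketch. The first two patterns are illustrative stand-ins invented here (a literal `a` instead of a backreference, and a bound of 10 chosen arbitrarily); the third is the pattern quoted in the first comment. The class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BoundedLookbehind {
    // Bounded stand-in for (?<!a.+)b: keeps the "one or more" minimum of +.
    static final Pattern MIN_ONE = Pattern.compile("(?<!a.{1,10})b");
    // Same bound with a minimum of 0, which behaves like (?<!a.*)b instead.
    static final Pattern MIN_ZERO = Pattern.compile("(?<!a.{0,10})b");
    // Pattern from the first comment: digits preceded by OF or MIRS and
    // up to five spaces.
    static final Pattern AFTER_KEYWORD =
            Pattern.compile("(?<=(OF[ ]{0,5})|(MIRS[ ]{0,5}))[0-9]+");

    static boolean found(Pattern p, String input) {
        return p.matcher(input).find();
    }

    // Returns the first match of AFTER_KEYWORD, or null if there is none.
    static String numberAfterKeyword(String input) {
        Matcher m = AFTER_KEYWORD.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // "ab": nothing sits between a and b, so only the {0,10} form blocks it.
        System.out.println(found(MIN_ONE, "ab"));
        System.out.println(found(MIN_ZERO, "ab"));
        System.out.println(numberAfterKeyword("OF   123"));
    }
}
```

On the input `ab`, the `{1,10}` version still finds `b` while the `{0,10}` version does not, which is the difference between replacing `+` and replacing `*`.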
G
5

Java takes things a step further by allowing finite repetition. You still cannot use the star or plus, but you can use the question mark and the curly braces with the max parameter specified. Java determines the minimum and maximum possible lengths of the lookbehind.
The lookbehind in the regex (?<!ab{2,4}c{3,5}d)test has 5 possible lengths. It can be from 7 to 11 characters long. When Java (version 6 or later) tries to match the lookbehind, it first steps back the minimum number of characters (7 in this example) in the string and then evaluates the regex inside the lookbehind as usual, from left to right. If it fails, Java steps back one more character and tries again. If the lookbehind continues to fail, Java continues to step back until the lookbehind either matches or it has stepped back the maximum number of characters (11 in this example).
This repeated stepping back through the subject string kills performance when the number of possible lengths of the lookbehind grows. Keep this in mind: don't choose an arbitrarily large maximum number of repetitions to work around the lack of infinite quantifiers inside lookbehind.
Java 4 and 5 have bugs that cause lookbehind with alternation or variable quantifiers to fail when it should succeed in some situations. These bugs were fixed in Java 6.
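
The length bounds for the example pattern can be checked with a short sketch. The class and method names are invented for illustration; the enumeration simply adds up one `a`, two to four `b`s, three to five `c`s and one `d`, and does not inspect how Java computes the bounds internally.

```java
import java.util.TreeSet;
import java.util.regex.Pattern;

class LookbehindLengths {
    static final Pattern P = Pattern.compile("(?<!ab{2,4}c{3,5}d)test");

    // Distinct lengths of strings that ab{2,4}c{3,5}d can match.
    static TreeSet<Integer> possibleLengths() {
        TreeSet<Integer> lengths = new TreeSet<Integer>();
        for (int b = 2; b <= 4; b++) {
            for (int c = 3; c <= 5; c++) {
                lengths.add(1 + b + c + 1); // a + b's + c's + d
            }
        }
        return lengths;
    }

    static boolean found(String input) {
        return P.matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(possibleLengths());
        System.out.println(found("xtest"));           // no forbidden prefix
        System.out.println(found("abbcccdtest"));     // shortest forbidden prefix
        System.out.println(found("abbbbcccccdtest")); // longest forbidden prefix
    }
}
```

The set comes out as 7 through 11. `test` is found after `x`, but not after either the shortest or the longest prefix that the lookbehind describes.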

Copied from Here

Gobbledegook answered 30/5, 2014 at 3:14 Comment(0)
