regexp to match string1 unless preceded by string2

5

16

Using Ruby, how can I use a single regex to match all occurrences of 'y' in "xy y ay xy +y" that are NOT preceded by 'x' (that is, the 'y' in y, ay, and +y)?
/[^x]y/ matches the preceding character too, so I need an alternative...

Vallombrosa answered 31/7, 2009 at 16:23 Comment(4)
I was going to suggest a negative lookbehind, but it looks like you're out of luck on that score: regular-expressions.info/lookaround.html#limitbehind. "Finally, flavors like JavaScript, Ruby and Tcl do not support lookbehind at all, even though they do support lookahead." - Genova
See also #530941 - Phylloquinone
If lookbehind assertions are not supported, we might use this: ([^x]y)|(^y). A 'y' might appear at the start of the string, and that case is not covered by /[^x]y/. - Transmute
FYI: Ruby 2.0.0 and up DOES support lookbehind. See docs.ruby-lang.org/en/2.0.0/… - Seabolt
35

You need a zero-width negative lookbehind assertion. Try /(?<!x)y/, which says exactly what you're looking for: find every 'y' not preceded by 'x'. The match does not include the prior character; that is the zero-width part.

Edited to add: Apparently this is only supported from Ruby 1.9 and up.
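
As a minimal sketch of the lookbehind in action (assuming Ruby 1.9 or later), the snippet below runs the pattern against the string from the question. The helper name `unpreceded_y_offsets` is made up for illustration and is not part of any library.

```ruby
# Hypothetical helper: offsets of every 'y' not immediately preceded by 'x'.
# Needs Ruby 1.9+ because it relies on the (?<!...) negative lookbehind.
def unpreceded_y_offsets(text)
  offsets = []
  # With a block, scan sets the last-match data for each hit.
  text.scan(/(?<!x)y/) { offsets << Regexp.last_match.begin(0) }
  offsets
end

sample = "xy y ay xy +y"
p sample.scan(/(?<!x)y/)        # => ["y", "y", "y"]
p unpreceded_y_offsets(sample)  # => [3, 6, 12]
```

Because the assertion is zero-width, each match is just the 'y' itself, so no capture group or post-processing is needed.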

Monomerous answered 31/7, 2009 at 16:27 Comment(0)
3

Negative look-behind is not supported until Ruby 1.9, but you can do something similar using scan:

"xy y ay xy +y".scan(/([^x])(y)/) # => [[" ", "y"], ["a", "y"], ["+", "y"]]
"xy y ay xy +y".scan(/([^x])(y)/).map {|match| match[1]}  # => ["y", "y", "y"]

Of course, this is much more difficult if you want to avoid much more than a single character before the y. Then you'd have to do something like:

"abby y crabby bally +y".scan(/(.*?)(y)/).reject {|str| str[0] =~ /ab/}  # => [[" ", "y"], [" ball", "y"], [" +", "y"]]
"abby y crabby bally +y".scan(/(.*?)(y)/).reject {|str| str[0] =~ /ab/}.map {|match| match[1]}  # => ["y", "y", "y"]
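
On Ruby 1.9 and later, a multi-character prefix can probably be handled with a lookbehind instead of scan plus reject. The sketch below assumes the goal is to skip any 'y' that comes directly after "abb"; the helper name `y_offsets_not_after` is hypothetical. Note that it only checks the text immediately before the 'y', whereas the reject version above discards a match whenever "ab" appears anywhere in the captured prefix.

```ruby
# Hypothetical helper: offsets of each 'y' that is not directly preceded
# by the literal string `prefix`. Requires Ruby 1.9+ (lookbehind support).
def y_offsets_not_after(text, prefix)
  # Regexp.escape keeps characters such as '+' or '.' literal.
  pattern = /(?<!#{Regexp.escape(prefix)})y/
  offsets = []
  text.scan(pattern) { offsets << Regexp.last_match.begin(0) }
  offsets
end

p y_offsets_not_after("abby y crabby bally +y", "abb")  # => [5, 18, 21]
```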
Herringbone answered 31/7, 2009 at 16:51 Comment(0)
2

Ruby 1.8 unfortunately doesn't support negative lookbehind (it was added in Ruby 1.9), so you'll have trouble if you need to look for more than a single character. For just one character, you can take care of this by capturing the match:

/[^x](y)/
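
One caveat with the capturing approach: [^x] has to consume a character, so a 'y' at the very start of the string is missed unless the pattern also allows the start anchor. The sketch below illustrates this; `captured_ys` is a hypothetical helper name, and the pattern is one possible workaround rather than a full substitute for lookbehind.

```ruby
# Hypothetical helper for Rubies without lookbehind: capture each 'y' that
# follows either the start of the string or a single character other than 'x'.
def captured_ys(text)
  text.scan(/(?:\A|[^x])(y)/).flatten
end

p "y xy ay".scan(/[^x](y)/).flatten  # => ["y"]       (leading 'y' missed)
p captured_ys("y xy ay")             # => ["y", "y"]
# Limitation: the consumed character cannot start another match, so the
# second of two adjacent 'y's is skipped here.
p captured_ys("yy")                  # => ["y"]
```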
Sumikosumma answered 31/7, 2009 at 16:29 Comment(2)
Ah, I'm not a Ruby expert - jbourque says negative lookbehind is in newer Ruby. There's your real answer. - Sumikosumma
[^x] must match one character. If 'y' occurs at the beginning of the line, we should match it too: /(?:\A|^|[^x])y/ - Afghan
-1

In PCRE, you use a negative look-behind:

(?<!x)y

Not sure if this is supported by Ruby, but you can always look it up.

Cent answered 31/7, 2009 at 16:27 Comment(0)
-1

It can be done with a negative lookbehind: (?<!x)y

Bristle answered 31/7, 2009 at 16:28 Comment(0)
