How to negate specific word in regex? [duplicate]

Asked 6/8, 2009 at 17:20 Answered 17/2, 2020 at 13:40

844

I know that I can negate a group of characters, as in [^bar], but I need a regular expression where the negation applies to a specific word. In my example, how do I negate the actual word bar, and not "any of the characters in bar"?

Resh answered 6/8, 2009 at 17:20 Comment(2)

Related: regex for matching something if it is not preceded by something else – Mcclurg 24/10, 2019 at 7:29

Something like this – Lyckman 7/1 at 19:23

1022

A great way to do this is to use negative lookahead:

^(?!.*bar).*$

The negative lookahead construct is the pair of parentheses, with the opening parenthesis followed by a question mark and an exclamation point. Inside the lookahead is any regex pattern.
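As a minimal sketch in Python (the sample strings are made up for illustration), the pattern above accepts only lines that contain no bar anywhere. The second pattern drops the .* inside the lookahead, as asked about in the comments, and only rejects lines that start with bar:

```python
import re

# The lookahead scans the whole line for "bar" before anything is consumed.
no_bar = re.compile(r'^(?!.*bar).*$')
# Without the inner .*, only a leading "bar" is rejected.
no_leading_bar = re.compile(r'^(?!bar).*$')

samples = ['foo', 'bar', 'foobar', 'barfoo']

accepted = [s for s in samples if no_bar.match(s)]
accepted_leading_only = [s for s in samples if no_leading_bar.match(s)]

print(accepted)               # ['foo']
print(accepted_leading_only)  # ['foo', 'foobar']
```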

Asserted answered 6/8, 2009 at 17:38 Comment(18)

This says it all (I probably would have started with (?!bar) and built up). I don't see why other people are making it so complicated. – Burlington 7/8, 2009 at 14:49

line start character at the beginning does a pretty good job. – Jive 23/10, 2012 at 8:39

I don't think light weight regex parsers like SLRE support ! operator yet. – Celisse 28/2, 2014 at 12:49

Nicely done - matches a line that has the specified string and the string is not preceded by anything and the string is followed by anything. This is by definition the absence of the string! Because if present, it will always be preceded by something, even if it's a line anchor ^ – Frederique 13/11, 2014 at 15:35

Is there a version of this that works in the Linux command line grep utility? – Sofiasofie 27/9, 2015 at 2:3

@NeilTraft how about grep -v bar :) – Copalite 12/8, 2016 at 16:0

If you are using grep then use -P option. -P enables perl regex. e.g. grep -P '(?!do not contain this string)' – Rambouillet 21/9, 2016 at 13:32

this worked "just right" with the extra info provided by @sgrillon's answer – Cumulate 30/10, 2016 at 18:36

I want to prevent the user from writing "Password", "password" or any other exact word. – Dunois 31/3, 2017 at 6:15

Unfortunately, this doesn't work with actual words. foo will match, bar won't, but foobar or barfoo won't either! – Whilst 30/6, 2017 at 21:1

This is super useful for an idempotent ansible replace – Niigata 13/9, 2017 at 22:20

@Whilst That is correct and expected as those other three contain the "bar" so they shouldn't match. Foo is the only word of those three you gave that doesn't have the "bar" – Motherland 18/12, 2019 at 20:56

this is exactly what i needed .. but I'm curious why doesn't ^(?!bar).*$ work? It's technically saying if it doesn't contain 'bar' right? why does it require the .* I have checked and it actually doesn't, can anyone explain or break it down for me. – Motherland 18/12, 2019 at 21:1

this solution does not work in R. – Pre 6/10, 2020 at 16:59

@carilynchin It's because ^ also applies within the lookahead. So you are saying you want all strings that don't start with bar. This means you will match all strings without bar AND all strings which have bar EXCEPT those that START with bar. That's not desired by OP. – Social 9/10, 2020 at 17:55

@Frederique - I didn't get your drift, but I suspect it's incorrect. – Doublet 6/8, 2021 at 11:28

Instead, I read ^(?!.*bar).*$ as "Match any string--it must NOT start with "any characters followed by 'bar'" --but it can have any other set of characters". The "must NOT start with ..." bit is ^(?!.*bar). The "can have any other..." bit is the final '.*$' – Doublet 6/8, 2021 at 11:35

Thinking about my text explanation above, I do not see a need for the final $ at the end - I think it can be dropped. So a slightly improved regex should be ^(?!.*bar).*. Can a regex guru validate this and update the answer please? – Doublet 6/8, 2021 at 11:43

Unless performance is of utmost concern, it's often easier just to run your results through a second pass, skipping those that match the words you want to negate.

Regular expressions usually mean you're doing scripting or some sort of low-performance task anyway, so find a solution that is easy to read, easy to understand and easy to maintain.
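A rough sketch of that two-pass idea in Python; the sample lines and the two patterns are invented for illustration:

```python
import re

lines = ['foo one', 'foo bar', 'baz', 'foo two']  # made-up sample data

wanted = re.compile(r'foo')    # first pass: what to keep
unwanted = re.compile(r'bar')  # second pass: what to drop

first_pass = [line for line in lines if wanted.search(line)]
result = [line for line in first_pass if not unwanted.search(line)]

print(result)  # ['foo one', 'foo two']
```

Each pattern stays trivial to read, at the cost of walking the data twice.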

Chronogram answered 6/8, 2009 at 17:33 Comment(3)

There are lots of situations where you don't control the workflow: you just get to write a single regexp which is a filter. – Sackman 22/3, 2018 at 4:8

And if you want to replace all Texts which don't match a certain regex? – Gorrono 8/11, 2019 at 19:10

It's an unusual idea, but it does work. Most of the answers are for PCRE, but I can't apply their solutions to RE2. – Sammiesammons 22/4, 2021 at 6:49

Solution:

^(?!.*STRING1|.*STRING2|.*STRING3).*$

xxxxxx OK

xxxSTRING1xxx KO (which is the desired result)

xxxSTRING2xxx KO (which is the desired result)

xxxSTRING3xxx KO (which is the desired result)
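If the list of strings is only known at run time, the pattern can be assembled from it. The sketch below assumes Python; build_exclusion_pattern is a hypothetical helper name, and it factors the repeated .* out of the alternation, which matches the same lines:

```python
import re

def build_exclusion_pattern(words):
    """Build ^(?!.*(?:w1|w2|...)).*$ from literal words (hypothetical helper)."""
    # re.escape keeps characters such as "." or "+" in a word literal.
    alternatives = '|'.join(re.escape(w) for w in words)
    return re.compile(r'^(?!.*(?:' + alternatives + r')).*$')

pattern = build_exclusion_pattern(['STRING1', 'STRING2', 'STRING3'])

for line in ['xxxxxx', 'xxxSTRING1xxx', 'xxxSTRING2xxx', 'xxxSTRING3xxx']:
    print(line, 'OK' if pattern.match(line) else 'KO')
```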

Wellgrounded answered 13/9, 2016 at 16:8 Comment(3)

thanks, this gave me the extra info i needed for multiple words – Cumulate 30/10, 2016 at 18:37

Am I the only one who hates "OK" and "KO" as indicators of passing a test? It's just one typo away from disaster... – Easting 13/12, 2021 at 7:32

@AJPerez, Yes OK KO is result of test – Maryrosemarys 13/11, 2023 at 8:41

You could either use a negative look-ahead or look-behind:

^(?!.*?bar).*
^(.(?<!bar))*?$

Or use just basics:

^(?:[^b]+|b(?:$|[^a]|a(?:$|[^r])))*$

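The two lookaround patterns match anything that does not contain bar. The basic pattern is close but not exact: it consumes the character after a b (or after ba) without rechecking it, so it also accepts strings such as bbar.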
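To see how the three patterns behave, here is a small Python comparison. The sample strings are my own; bbar is included as an edge case for the lookaround-free pattern:

```python
import re

lookahead = re.compile(r'^(?!.*?bar).*')
lookbehind = re.compile(r'^(.(?<!bar))*?$')
basic = re.compile(r'^(?:[^b]+|b(?:$|[^a]|a(?:$|[^r])))*$')

samples = ['foo', 'bar', 'foobar', 'ba', 'bbar']

# For each sample: (lookahead matches, lookbehind matches, basic matches)
results = {
    s: (bool(lookahead.match(s)), bool(lookbehind.match(s)), bool(basic.match(s)))
    for s in samples
}

for s, flags in results.items():
    print(s, flags)
```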

Duce answered 6/8, 2009 at 17:24 Comment(8)

What languages don't support (negative) look-behinds and/or (negative) look-aheads in regex? – Generalissimo 6/8, 2009 at 17:29

I think the point being made is, looking at your pattern it's not at all clear that all you're doing is rejecting the word "bar". – Chronogram 6/8, 2009 at 17:34

@Bryan: And, in fact, it doesn't reject the word "bar". It just rejects "b" when followed by "ar". – Generalissimo 6/8, 2009 at 18:5

Good idea, but not supported everywhere. Afaik Javascript supports negative look-ahead, but not look-behind. I don't know details about other languages, but this can be helpful: en.wikipedia.org/wiki/Comparison_of_regular_expression_engines – Brayton 8/7, 2015 at 7:58

@Generalissimo bash doesn't support negative look-behind/look-ahead. – Dwightdwindle 2/3, 2016 at 14:18

@Generalissimo look-aheads and look-behinds are not posix – Greenway 6/1, 2018 at 10:23

Can you explain the second solution? (.(?<!bar))*? (?<!bar) is a negative lookbehind, isn't it? It follows the pattern (?<!a)b, that would mean: wherever you find a b, make sure there isn't an a before it. Only that in this case, b is empty for us; so it would mean: wherever you find anything, make sure there isn't a bar before it. But how does it work the (.<negative lookbehind>)*?? Why do you need the . and the last ? there? Many thanks! – Laraelaraine 21/5, 2019 at 10:22

` ^(?!.*?bar).* ` Why did you use lazy here ? Why does just ^(?!bar).* not work ? – Bridgettebridgewater 6/2, 2023 at 11:16

The following regex will do what you want, as long as negative lookbehinds and lookaheads are supported. The only problem is that each match is a single character rather than the whole run of characters between two consecutive "bar"s, which can mean high overhead if you're working with very long strings.

b(?!ar)|(?<!b)a|a(?!r)|(?<!ba)r|[^bar]
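A quick way to see the character-by-character behavior, sketched in Python with a made-up input string. Joining the matches gives the text with every bar left out:

```python
import re

not_bar_char = re.compile(r'b(?!ar)|(?<!b)a|a(?!r)|(?<!ba)r|[^bar]')

text = 'foobarbaz'
pieces = not_bar_char.findall(text)  # one list entry per matched character

print(pieces)           # ['f', 'o', 'o', 'b', 'a', 'z']
print(''.join(pieces))  # foobaz
```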

Generalissimo answered 6/8, 2009 at 17:20 Comment(4)

Instead of those multiple updates which force us to read the wrong answers before getting to your final answer, why not rewrite your answer to be complete, but without the somewhat confusing bad parts? If somebody really cares about the edit history they can use the built-in features of this site. – Chronogram 19/6, 2012 at 13:12

Been two and a half years since I wrote this answer, but sure. – Generalissimo 19/6, 2012 at 14:39

damn that hurts, try this (?:(?!bar).)* – Chryso 7/10, 2014 at 18:15

@Mary, This won't work as expected. For example /(?:(?!bar).)*/g on foobar returns foo AND ar. – Eaddy 7/1, 2015 at 16:8

I came across this forum thread while trying to identify a regex for the following English statement:

Given an input string, match everything unless this input string is exactly 'bar'; for example I want to match 'barrier' and 'disbar' as well as 'foo'.

Here's the regex I came up with

^(bar.+|(?!bar).*)$

My English translation of the regex is "match the string if it starts with 'bar' and it has at least one other character, or if the string does not start with 'bar'".
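A short Python check of that reading, using the strings from the statement above:

```python
import re

not_exactly_bar = re.compile(r'^(bar.+|(?!bar).*)$')

samples = ['bar', 'barrier', 'disbar', 'foo']
accepted = [s for s in samples if not_exactly_bar.match(s)]

print(accepted)  # ['barrier', 'disbar', 'foo']
```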

Courtney answered 10/9, 2010 at 20:44 Comment(3)

@ReReqest - you will have much better chance to have this question answered if you post it as a separate question. In that you can provide link back to this question if you want. For the substance of question - it looks OK but I'm no regex guru – Resh 11/9, 2010 at 17:47

That was the one I was looking for. It really matches everything except bar. – Aulos 17/12, 2015 at 20:34

^(?!bar$).* matches the same as this (everything except exactly bar) and avoids repetition. – Padraig 6/6, 2018 at 13:17

The accepted answer is nice but is really a work-around for the lack of a simple sub-expression negation operator in regexes. This is why grep --invert-match exists. So in *nixes, you can accomplish the desired result using pipes and a second regex.

grep 'something I want' | grep --invert-match 'but not these ones'

Still a workaround, but maybe easier to remember.

Eclogue answered 4/1, 2016 at 0:4 Comment(3)

This is the right answer for someone using grep, which certainly qualifies as regex. I just wish this answer were more prominent (even included in the accepted answer) so that I hadn't spent time with the other answers first. – Kurtzman 31/12, 2019 at 14:12

I can't see the invert match option in R. Is it restricted to unix grep? – Pre 6/10, 2020 at 17:2

I use a GUI-based grep like TextCrawler. But if you are not using Windows OS, not sure what to use. – Pullover 5/8, 2022 at 5:35

Extracted from this comment by bkDJ:

^(?!bar$).*

The nice property of this solution is that it's possible to clearly negate (exclude) multiple words:

^(?!bar$|foo$|banana$).*
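A sketch in Python, assuming the excluded words arrive as a list (the sample strings are invented). It also shows why the trailing .* matters: the lookahead consumes nothing, so without it the match is empty.

```python
import re

excluded = ['bar', 'foo', 'banana']
alternatives = '|'.join(re.escape(w) for w in excluded)
# Equivalent to ^(?!bar$|foo$|banana$).* with the $ factored out.
pattern = re.compile(r'^(?!(?:' + alternatives + r')$).*')

samples = ['bar', 'foo', 'banana', 'barrier', 'foobar', 'bananas']
accepted = [s for s in samples if pattern.match(s)]
print(accepted)  # ['barrier', 'foobar', 'bananas']

# Lookahead only: the match succeeds but is zero characters long.
empty_match = re.match(r'^(?!bar$)', 'barrier').group()
full_match = pattern.match('barrier').group()
print(repr(empty_match))  # ''
print(repr(full_match))   # 'barrier'
```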

Hoppe answered 10/5, 2019 at 10:18 Comment(4)

why do you need trailing .*? – Brandtr 13/8, 2019 at 20:36

Because the negative lookahead doesn't match any characters. – Knecht 11/7, 2022 at 15:13

Seems to work by extracting the $, too: ^(?!(bar|foo|banana)$).* :-) – Coraliecoraline 18/7, 2022 at 11:32

@SashaBond without .*, it doesn't work. You can check here. – Lyckman 7/1 at 19:28

If it's truly a word, bar, that you don't want to match, then:

^(?!.*\bbar\b).*$

The above will match any string that does not contain bar as a whole word, that is, bar delimited by word boundaries on both sides. However, the period/dot (.) used in the above pattern will not match newline characters unless the correct regex flag is used:

^(?s)(?!.*\bbar\b).*$

Alternatively:

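^(?![\s\S]*\bbar\b)[\s\S]*$

Instead of using any special flag, we are looking for any character that is either white space or non-white space. That should cover every character. Note that the dot inside the lookahead has to be replaced as well; otherwise the lookahead stops at the first newline and a bar on a later line goes unnoticed.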
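Here is how these variants behave in Python, as a sketch with made-up inputs. Two assumptions to flag: the single-line flag is passed as re.DOTALL rather than written as (?s) after the ^, since recent Python versions reject an inline global flag that is not at the very start of the pattern, and the character-class variant uses [\s\S] inside the lookahead too.

```python
import re

single_line = re.compile(r'^(?!.*\bbar\b).*$')
dot_all = re.compile(r'^(?!.*\bbar\b).*$', re.DOTALL)
any_char = re.compile(r'^(?![\s\S]*\bbar\b)[\s\S]*$')

# Word boundaries: "barrier" is fine, a standalone "bar" is not.
print(bool(single_line.match('a barrier')))   # True
print(bool(single_line.match('a bar here')))  # False

clean = 'first line\nsecond line'
dirty = 'first line\nsecond bar line'

# Without a flag, . cannot cross the newline, so $ is never reached.
print(bool(single_line.match(clean)))  # False
print(bool(dot_all.match(clean)))      # True
print(bool(any_char.match(clean)))     # True
print(bool(dot_all.match(dirty)))      # False
print(bool(any_char.match(dirty)))     # False
```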

But what if we would like to match words that might contain bar, but just not the specific word bar?

(?!\bbar\b)\b[A-Za-z-]*bar[a-z-]*\b

(?!\bbar\b) Assert that the next input is not bar on a word boundary.
\b[A-Za-z-]*bar[a-z-]*\b Matches any word on a word boundary that contains bar.
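A small Python illustration with an invented sentence. It assumes the character class is written as [A-Za-z-], with no backslash in front of the opening bracket:

```python
import re

contains_bar_word = re.compile(r'(?!\bbar\b)\b[A-Za-z-]*bar[a-z-]*\b')

text = 'the bar has a barrier and a disbarred crowbar'
words = contains_bar_word.findall(text)

print(words)  # ['barrier', 'disbarred', 'crowbar']
```

The standalone word bar is skipped, while words that merely contain it are returned.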

See Regex Demo

Shavonneshaw answered 17/2, 2020 at 13:40 Comment(0)

I wish to complement the accepted answer and contribute to the discussion with my late answer.

@ChrisVanOpstal shared this regex tutorial which is a great resource for learning regex.

However, it was really time consuming to read through.

I made a cheatsheet for mnemonic convenience.

This reference is based on the braces [], (), and {} leading each class, and I find it easy to recall.

Regex = {
 'single_character': ['[]', '.', {'negate':'^'}],
 'capturing_group' : ['()', '|', '\\', 'backreferences and named group'],
 'repetition'      : ['{}', '*', '+', '?', 'greedy vs. lazy'],
 'anchor'          : ['^', r'\b', '$'],
 'non_printable'   : [r'\n', r'\t', r'\r', r'\f', r'\v'],
 'shorthand'       : [r'\d', r'\w', r'\s'],
 }

Wigley answered 6/12, 2017 at 6:32 Comment(0)

Just thought of something else that could be done. It's very different from my first answer, as it doesn't use regular expressions, so I decided to make a second answer post.

Use your language of choice's split() method equivalent on the string with the word to negate as the argument for what to split on. An example using Python:

>>> text = 'barbarasdbarbar 1234egb ar bar32 sdfbaraadf'
>>> text.split('bar')
['', '', 'asd', '', ' 1234egb ar ', '32 sdf', 'aadf']

The nice thing about doing it this way, in Python at least (I don't remember if the functionality would be the same in, say, Visual Basic or Java), is that it lets you know indirectly when "bar" was repeated in the string due to the fact that the empty strings between "bar"s are included in the list of results (though the empty string at the beginning is due to there being a "bar" at the beginning of the string). If you don't want that, you can simply remove the empty strings from the list.
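For example, continuing with the same string, dropping the empty strings and counting the occurrences could look like this (a minimal sketch):

```python
text = 'barbarasdbarbar 1234egb ar bar32 sdfbaraadf'
parts = text.split('bar')

# Keep only the non-empty pieces between occurrences of "bar".
non_empty = [part for part in parts if part]
print(non_empty)  # ['asd', ' 1234egb ar ', '32 sdf', 'aadf']

# Splitting on n separators yields n + 1 parts.
bar_count = len(parts) - 1
print(bar_count)  # 6
```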

Generalissimo answered 7/8, 2009 at 19:58 Comment(1)

@Ajk_P yes but this kind of answers may help the OP think outside the box, they could've been fixated on regexes not realizing that it could be solved without them. – Funds 21/7, 2017 at 15:53

I had a list of file names, and I wanted to exclude certain ones, with this sort of behavior (Ruby):

files = [
  'mydir/states.rb',      # don't match these
  'countries.rb',
  'mydir/states_bkp.rb',  # match these
  'mydir/city_states.rb' 
]
excluded = ['states', 'countries']

# set my_rgx here

result = WankyAPI.filter(files, my_rgx)  # I didn't write WankyAPI...
assert result == ['mydir/city_states.rb', 'mydir/states_bkp.rb']

Here's my solution:

excluded_rgx = excluded.map{|e| e+'\.'}.join('|')
my_rgx = /(^|\/)((?!#{excluded_rgx})[^\.\/]*)\.rb$/

My assumptions for this application:

The string to be excluded is at the beginning of the input, or immediately following a slash.
The permitted strings end with .rb.
Permitted filenames don't have a . character before the .rb.
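For readers who want to try the idea without Ruby, here is a rough Python port under the same three assumptions. The list comprehension is only a stand-in for the filtering API, whose real behavior is not shown here:

```python
import re

files = [
    'mydir/states.rb',      # don't match these
    'countries.rb',
    'mydir/states_bkp.rb',  # match these
    'mydir/city_states.rb',
]
excluded = ['states', 'countries']

# Each excluded name must be followed directly by the dot of the extension.
excluded_rgx = '|'.join(re.escape(name) + r'\.' for name in excluded)
my_rgx = re.compile(r'(^|/)((?!' + excluded_rgx + r')[^./]*)\.rb$')

# Hypothetical stand-in for the external filter call.
result = [f for f in files if my_rgx.search(f)]

print(result)  # ['mydir/states_bkp.rb', 'mydir/city_states.rb']
```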

Contrarious answered 6/11, 2015 at 11:42 Comment(0)
