Regex "NOT" in Google Sheets (RE2)
Asked Answered
M

3

6

I want to check if in a cell there is one word but not another. In this post there was some buzz around this matter but the elected solution included a script function. There has to be an easier way to do that.

I want to check if in the string "investimentos" the string "investimentos" is present and "fundos" is not.

I've already tried those below (disclaimer: I'm a beginner with regex):

=regexmatch("investimentos";"(investimentos)^(fundos)")
=regexmatch("investimentos";"(investimentos).*^(fundos)")
=regexmatch("investimentos";"(investimentos)(^fundos)")
=regexmatch("investimentos";"(investimentos).*(^fundos)")

I'm always getting false. Why is that?

Misguided answered 22/2, 2019 at 21:33 Comment(6)
Same as in my previous answer: there is no lookaround support in RE2, so you cannot use that logic in the RE2 expression. Use =AND(REGEXMATCH(A1;"word1");NOT(REGEXMATCH(A1;"word2")))Beniamino
thanks. your again.Misguided
You could do that with a single regex if the negated value is just 1 character. Like ^[^I]*GO[^I]*$ will match a string that has no I but contains GO, but in case there are multiple chars in that word, it won't work.Beniamino
github.com/google/re2/wiki/SyntaxDecompound
Regarding the edits in this question, I have given way on the title, even if I tend to think of all-caps material as a bit shouty. However, I have removed the request to readers of "shining a light on it" as that is rather vague, and is not as readily understandable as "Why does X happen". We have guidance on vague questions in this discussion, which is worth a read.Franconian
The Stack Overflow regex tag info page has some general notes on negation in regular expressions.Waal
A
10

There is no lookaround support in RE2, so you cannot use the common logic to match one string excluding another.

You could do that with a single regex if the negated value is just 1 character. Like ^[^I]*GO[^I]*$ will match a string that has no I but contains GO, but in case there are multiple chars in the word you want to exclude, it won't work.

Use

=AND(REGEXMATCH(A1;"investimentos");NOT(REGEXMATCH(A1;"fundos")))
Alkalinity answered 22/2, 2019 at 21:43 Comment(0)
D
2

try:

=(REGEXMATCH(A1; "investimentos"))*(NOT(REGEXMATCH(A1; "fundos")))

or:

=(REGEXMATCH(A1; "investimentos"))*(REGEXMATCH(A1; "[^fundos]"))
Decompound answered 23/12, 2019 at 1:1 Comment(2)
Hi @Decompound is it possible to do the same with a query formula while keeping the blank cels vertically? I have this formula that works but it filters intermediary blank cells: =QUERY(B10:B,"SELECT B WHERE B MATCHES '.+(\.|\?|\!|\.\.\.)[\sA-Z].+|.+[a-z(\.|\?|\!|\.\.\.)]$' AND NOT B MATCHES 'Zn(\.|\?|\!|\.\.\.)[\sA-Z].+|.+[a-z(\.|\?|\!|\.\.\.)]$'",0)Fullblooded
This one is equivalent to the Query formula above and gives the expected result with intermediary blank cells =IFNA(ArrayFormula( IFS(ISNUMBER(B10:B),"", REGEXMATCH(B10:B,".+(\.|\?|\!|\.\.\.)[\sA-Z].+|(.+\.)$")=FALSE,"", (REGEXMATCH(B10:B,".+(\.|\?|\!|\.\.\.)[\sA-Z].+|(.+\.)$")) * (NOT(REGEXMATCH(B10:B,"Zn(\.|\?|\!|\.\.\.)[\sA-Z].+"))),B10:B)),"") But I'd prefer using the Query formula if possible.Fullblooded
G
1

If, like me, you're trying to use regexmatch within a filter function, the combination of and and not functions doesn't seem to work - I'm not sure why but the error suggests it's only returning one row, rather than all the rows the positive and negative regexmatch functions match too.

However, @player0's suggestion of combining the * and not functions to combine the positive and negative regexmatch functions does work. So, for example:

=filter('Sheet 1'!B:B,(regexmatch('Sheet 1'!$D:$D,"Positive Search Term"))*not(regexmatch('Sheet 1'!$D:$D,"Negative Search Term")))
Gorgonian answered 20/11, 2023 at 11:33 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.