Match string with Regex as long as it is not surrounded by parentheses
Asked Answered
N

3

8

I am looking to match a string "Order By XXX" where XXX can be any letter, number, period, comma, space or square bracket. However, I would only like to match this if it is not surrounded by parentheses (parentheses on one side is ok, as long as it it not on both sides). So it should match the part in italics from "", by it should not match anything in

Should match (matched section in italics):

  • Select X from Y order by z
  • Select y = (select top 1 Z from C Order by [ID] desc)

Should not match:

  • Select X from Y (order by z)
  • Select a.a, NTILE(4) OVER (Order by a.b) group by a.c

I have the regex string for matching the order by text: [ ]*order by [\w,.\[\] ]+. However, I am having some trouble getting the lookahead/behind the work properly. Any advice on how to proceed?

Nairobi answered 14/6, 2012 at 14:4 Comment(3)
I can't differentiate between (select top 1 Z from C Order by [ID] desc) and (OVER Order by a.b) with your criteria alone.Lafrance
@Lafrance - yup. misplaced the paren in the second example. Edited.Nairobi
I am not much into regular expressions. I just used some very trivial ones upto now. I found a tool that you can check your regexs . Maybe it can help you as well. here is the link asterworld.com/en/soft/010.htmlSweeting
A
2

Try this:

(?<!\(\s*)order\s+by\s+[\w,.\[\] ]+(?<!\s*\))

When tested in PowerShell:

PS> @(
    'Select X from Y order by z'
    'Select y = (select top 1 Z from C Order by [ID] desc)'
    'Select X from Y (order by z)'
    'Select a.a, NTILE(4) OVER (Order by a.b) group by a.c'
    'Order by 87'
    '(Order by 87)'
    '( Order by 87 )'
    '(Order by 87 )'
    '( Order by 87)'
    'Order by _foo'
) -match '(?<!\(\s*)order\s+by\s+[\w,.\[\] ]+(?<!\s*\))'

Select X from Y order by z
Select y = (select top 1 Z from C Order by [ID] desc)
Order by 87
Order by _foo

PS>
Acetylate answered 14/6, 2012 at 14:37 Comment(2)
Close enough to serve my purposes so I am accepting. However, this does not match on Select a.a, NTILE(4) OVER (Order by a.b group by a.c, when according to the question this should match (opening paren without a matching closing paren).Nairobi
Thanks, Yaakov. I'll have a think about that last 20%. :-)Acetylate
C
0

This works for me, let me know if there are other cases I'm missing:
Regex r = new Regex(@"[^(](order by [^)]+)", RegexOptions.IgnoreCase);

Crandell answered 14/6, 2012 at 14:28 Comment(3)
the character class in the front matches anything that is not an opening parentheses. So if you run it against Select a.a, NTILE(4) OVER (Order by a.b) group by a.c it works (doesn't match). However, if you run it against Select a.a, NTILE(4) OVER Order by a.b) group by a.c (taking away the opening paren before "Order") then it matches "R OVER Order by a.b" - since the R of "Over" is a character that is not an open paren.Nairobi
It seems that this will match the case that the OP said should NOT match.Lafrance
Maybe I got this wrong but what about a multi-step regex: Don't accept on @"(order by [^)]+) OR @"[^(](order by .+) OR @"[^(](order by [^)]+). This way we catch all cases of missing parenthesis.Crandell
T
-1

you can use alternation like this:

\(?(order by [a-z0-9., \[\]]+)(?![a-z0-9., \[\]])(?<!\))|[^(](order by [a-z0-9., \[\]]+)\)

"order by XXX" would be either captured by the first or the second capturing parentheses.

Tananarive answered 14/6, 2012 at 14:53 Comment(2)
The point is that when it is surrounded by parentheses, I don't want it to match.Nairobi
it should only match if it is only surrounded by parentheses on one side or the other but not both.Tananarive

© 2022 - 2024 — McMap. All rights reserved.