Greedy, Non-Greedy, All-Greedy Matching in C# Regex

How can I get all the matches in the following example:

// Only "abcd" is matched
MatchCollection greedyMatches = Regex.Matches("abcd", @"ab.*");

// Only "ab" is matched
MatchCollection lazyMatches   = Regex.Matches("abcd", @"ab.*?");

// How can I get all matches: "ab", "abc", "abcd"

P.S.: I want to get all the matches in a generic way. The code above is just an example.

Dibromide answered 9/10, 2010 at 22:48 Comment(2)
What's your use case? We can give you better advice with a bit more information. Right now, this task looks very bizarre, and perhaps ill-advised. - Unitive
Hi user359996, thanks for your comments. I agree with what you mentioned. I will contact my client to see if this is necessary or not. - Dibromide

You could use something like:

MatchCollection nonGreedyMatches = Regex.Matches("abcd", @"(((ab)c)d)");

Then you should have three capture groups containing abcd, abc and ab (groups 1, 2 and 3, numbered by opening parenthesis).

But, to be honest, this kind of regex doesn't make much sense, and it becomes unreadable as it gets bigger.
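As a minimal sketch of reading those groups back (the class and method names here are hypothetical, chosen only for illustration), the nested pattern can be queried like this:

```csharp
using System;
using System.Text.RegularExpressions;

public static class NestedGroupsDemo
{
    // Hypothetical helper: returns the nested captures, innermost first.
    public static string[] Prefixes(string input)
    {
        Match m = Regex.Match(input, @"(((ab)c)d)");
        if (!m.Success)
        {
            return new string[0];
        }
        // Groups are numbered by their opening parenthesis:
        // 1 = "abcd", 2 = "abc", 3 = "ab".
        return new[] { m.Groups[3].Value, m.Groups[2].Value, m.Groups[1].Value };
    }

    public static void Main()
    {
        foreach (string s in Prefixes("abcd"))
        {
            Console.WriteLine(s); // prints ab, abc, abcd on separate lines
        }
    }
}
```

Note that this only works when the whole string "abcd" is present, because all three groups belong to one match; it is not a generic solution.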

Edit:

MatchCollection nonGreedyMatches = Regex.Matches("abcd", @"ab.?");

By the way, you had an error there. The pattern ab.? can only match ab or abc (read: ab plus one optional character).

Lazy version of:

MatchCollection greedyMatches    = Regex.Matches("abcd", @"ab.*");

is:

MatchCollection nonGreedyMatches    = Regex.Matches("abcd", @"ab.*?");
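
To make the difference between the three quantifiers concrete, here is a small sketch that runs each pattern against "abcd" and prints the first match. The helper name FirstMatch is made up for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QuantifierDemo
{
    // Hypothetical helper: value of the first match, or "" if there is none.
    public static string FirstMatch(string input, string pattern)
    {
        return Regex.Match(input, pattern).Value;
    }

    public static void Main()
    {
        Console.WriteLine(FirstMatch("abcd", @"ab.*"));  // greedy: abcd
        Console.WriteLine(FirstMatch("abcd", @"ab.*?")); // lazy: ab
        Console.WriteLine(FirstMatch("abcd", @"ab.?"));  // optional single character: abc
    }
}
```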
Nixie answered 9/10, 2010 at 23:6 Comment(2)
Hi Tseng, I agree that this RegEx looks weird, I will contact my client to check if they really need it. - Dibromide
I discussed with my client, and they said this is not necessary. Thanks for your reply. - Dibromide

If a solution exists, it probably involves a capturing group and the RightToLeft option:

string s = @"abcd";
Regex r = new Regex(@"(?<=^(ab.*)).*?", RegexOptions.RightToLeft);
foreach (Match m in r.Matches(s))
{
  Console.WriteLine(m.Groups[1].Value);
}

output:

abcd
abc
ab

I say "if" because, while it works for your simple test case, I can't guarantee this trick will help with your real-world problem. RightToLeft mode is one of .NET's more innovative features; offhand, I can't think of another flavor that has anything equivalent to it. The official documentation on it is sparse (to put it mildly), and so far there don't seem to be many developers using it and sharing their experiences online. So try it and see what happens.
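
Since the trick above comes without guarantees, one way to cross-check it on your own inputs is a brute-force enumeration. The sketch below is not part of the answer's method: it is a different, much slower technique that tests every substring against the pattern anchored with \A and \z. The names AllMatchesSketch and AllSubstringMatches are hypothetical, and the approach assumes the pattern does not rely on context outside the candidate substring (lookarounds and ^/$ anchors will behave differently).

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class AllMatchesSketch
{
    // Hypothetical helper: every substring of input that the pattern matches in full.
    // Performs O(n^2) regex checks, so it is only suitable for short inputs.
    public static List<string> AllSubstringMatches(string input, string pattern)
    {
        // \A and \z force the pattern to cover the whole candidate substring.
        var whole = new Regex(@"\A(?:" + pattern + @")\z");
        var results = new List<string>();
        for (int start = 0; start < input.Length; start++)
        {
            for (int length = 1; length <= input.Length - start; length++)
            {
                string candidate = input.Substring(start, length);
                if (whole.IsMatch(candidate))
                {
                    results.Add(candidate);
                }
            }
        }
        return results;
    }

    public static void Main()
    {
        foreach (string s in AllSubstringMatches("abcd", @"ab.*"))
        {
            Console.WriteLine(s); // prints ab, abc, abcd on separate lines
        }
    }
}
```

If both approaches agree on a representative set of inputs, that is some evidence the RightToLeft pattern does what you need; it is not a proof.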

Peculiarity answered 10/10, 2010 at 1:5 Comment(0)

You can't get three different results from only one match.

If you want to match only "ab" you can use ab.*? or a.{1} (or a lot of other options)
If you want to match only "abc" you can use ab. or a.{2} (or a lot of other options)
If you want to match only "abcd" you can use ab.* or a.{3} (or a lot of other options)
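
A minimal sketch of that one-pattern-per-result idea, using a few of the patterns listed above (the class and method names are illustrative only):

```csharp
using System;
using System.Text.RegularExpressions;

public static class OnePatternPerResult
{
    // Hypothetical helper: runs each pattern once and returns the first match of each.
    public static string[] Run(string input, params string[] patterns)
    {
        var values = new string[patterns.Length];
        for (int i = 0; i < patterns.Length; i++)
        {
            values[i] = Regex.Match(input, patterns[i]).Value;
        }
        return values;
    }

    public static void Main()
    {
        // Three separate patterns, three separate matches.
        foreach (string v in Run("abcd", @"a.{1}", @"ab.", @"ab.*"))
        {
            Console.WriteLine(v); // prints ab, abc, abcd on separate lines
        }
    }
}
```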

Toplofty answered 9/10, 2010 at 22:50 Comment(1)
Hi Tahbaza and Colin Hebert, thanks for your replies, but I'm wondering if there is a generic way, not only for this specific example. - Dibromide
