Regex: How to capture all iterations in repeated capturing group

I would expect these lines of C#:

var regex = new Regex("A(bC*)*");
var match = regex.Match("AbCCbbCbCCCCbbb");
var groups = match.Groups;

to return something like:

["AbCCbbCbCCCCbbb", "A", "bCC", "b", "bC", "bCCCC", "b", "b", "b"]

but instead it returns only the last captured match:

["AbCCbbCbCCCCbbb", "b"]

Here Regex101 also displays the following as a warning:

A repeated capturing group will only capture the last iteration. Put a capturing group around the repeated group to capture all iterations or use a non-capturing group instead if you're not interested in the data

How should I change my regex pattern?

Clem answered 4/10, 2016 at 6:57 Comment(3)
Regex101 does not support .NET regex flavor. - Flavopurpurin
Put a capturing group around the repeated group to capture all iterations - Pearliepearline
@Groo I did, but it didn't work. - Clem
C
2

If you want to also capture A, just wrap it with parentheses: new Regex("(A)(bC*)*"). See the regex demo.

Then collect every value stored in each group's CaptureCollection (the Captures property keeps all iterations, while Value holds only the last one):

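// Requires: using System; using System.Linq; using System.Text.RegularExpressions;
var regex = new Regex("(A)(bC*)*");
// Flatten every capture of every group into one list.
// Group 0 is the whole match, so it comes first in the output.
var captures = regex.Matches("AbCCbbCbCCCCbbb")
    .Cast<Match>()
    .SelectMany(m => m.Groups.Cast<Group>()
        .SelectMany(g => g.Captures
            .Cast<Capture>()
            .Select(c => c.Value)))
    .ToList();
foreach (var s in captures)
    Console.WriteLine(s);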

See the C# demo
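
If only the repeated group is of interest, the same idea can be narrowed to a single group. The following is a minimal sketch, not the answerer's code: the names `CaptureDemo` and `RepeatedCaptures` are made up for illustration. It relies on the fact that in .NET `Group.Value` reports only the last iteration, while `Group.Captures` keeps every iteration in order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class CaptureDemo
{
    // Hypothetical helper: returns every iteration captured by group 2, i.e. (bC*).
    public static List<string> RepeatedCaptures(string input)
    {
        var match = Regex.Match(input, "(A)(bC*)*");
        // Groups[2].Value would give only the last iteration;
        // Groups[2].Captures holds all of them, in match order.
        return match.Groups[2].Captures
            .Cast<Capture>()
            .Select(c => c.Value)
            .ToList();
    }

    public static void Main()
    {
        // For the question's input this prints: bCC, b, bC, bCCCC, b, b, b
        Console.WriteLine(string.Join(", ", RepeatedCaptures("AbCCbbCbCCCCbbb")));
    }
}
```

If the pattern does not match at all, the group's capture collection is empty, so the helper returns an empty list without throwing.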

Cuckoo answered 4/10, 2016 at 7:12 Comment(0)
B
1

Maybe try this:

A|b(C+)?

Tested in Notepad++

Edit: If you want this pattern with groups:

(A)|(b(C+)?)
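
A hedged sketch of how this alternation could be used from C#, assuming the goal is one token per match: each token is a separate match, so the values come from iterating `Regex.Matches` rather than from the groups of a single `Match`. In any one match only one alternative participates, so the groups belonging to the other alternative are empty. `TokenDemo` and `Tokens` are illustrative names, not part of any library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class TokenDemo
{
    // Hypothetical helper: returns the text of every successive match.
    public static List<string> Tokens(string input)
    {
        return Regex.Matches(input, "(A)|(b(C+)?)")
            .Cast<Match>()
            .Select(m => m.Value)
            .ToList();
    }

    public static void Main()
    {
        // For the question's input this prints: A, bCC, b, bC, bCCCC, b, b, b
        Console.WriteLine(string.Join(", ", Tokens("AbCCbbCbCCCCbbb")));
    }
}
```

Unlike the original pattern, this one does not require the input to start with A, so it also tokenizes strings such as "bCCb".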
Bonanza answered 4/10, 2016 at 7:1 Comment(3)
It works in regex101 with the JavaScript flavor and in Notepad++, but it doesn't work in Visual Studio with C#. It returns {"A", "A", "", ""}. - Clem
Note: This is a completely different pattern. For example, it would match against "bCCbbCbCCCCbbb", while the OP's regex would not. - Pastel
@WiktorStribiżew I would rather say: do not use .NET if you need a decent regex parser. Apart from some language-specific extensions (none are in use here), the capture behavior of a regex is standardized and works the same everywhere. If it does not work that way in .NET, then it is simply broken. Capturing is not part of a "flavor"; languages cannot choose freely on this aspect. They can extend the standards, but what the asker wants to do requires no extension. - Intaglio