How do a{n}? and a{n} differ?
Asked Answered
U

2

12

I'm trying to understand the following regular expression quantifier (a is just an exemplary token here):

a{n}?

How does the question mark affect the match of the above expression? And how does it differ from the following?

a{n}

I would have expected the pattern aa{1}?a to match both aaa and aa for example. While it matches aaa, aa is not a match. The pattern a(a{1})?a does match both, so the parentheses do make a difference here.


Note: The msdn article Quantifiers in Regular Expressions states for both:

The {n} quantifier matches the preceding element exactly n times, where n is any integer.

For {n}?, it adds the following, not overly helpful part:

It is the lazy counterpart of the greedy quantifier {n}+.

Unaware answered 1/8, 2013 at 23:2 Comment(7)
What does this have to do with C#?Indetermination
@AustinHenley may be he means it in context of C# Regex only.Holmes
@AustinHenley Different languages have different Regex flavors. As far as I know, the {n}? quantifier is not part of the POSIX standard, so I added the language I'm working with to avoid ambiguities.Unaware
It almost seems like whoever wrote that article wasn't thinking about what they were writing. How could there be a greedy vs lazy version of matching exactly n times?Eulaeulachon
@Eulaeulachon I would understand the thinking behind an optional version (match it either exactly n times or not at all), which can be expressed by the pattern a(a{1})?a. But you're right, what's a greedy, exactly counted match?Unaware
@Mobstaa: Not in regular expressions. You should read the question before commenting or answering. :-)Echo
This answer has been added to the Stack Overflow Regular Expression FAQ, under "Quantifiers > More on the differences..."Ransell
A
13

Nothing. The article states:

The {n} quantifier matches the preceding element exactly n times, where n is any integer. {n} is a greedy quantifier whose lazy equivalent is {n}?.

The {n}? quantifier matches the preceding element exactly n times, where n is any integer. It is the lazy counterpart of the greedy quantifier {n}+.

Notice the text is exactly the same. Basically, adding ? does not change the behavior of the quantifier. It appears that .NET's regular expression engine supports {n}? as a alternative to {n}.


Interestingly, this article does appear to contain an error:

The {n,} quantifier matches the preceding element at least n times, where n is any integer. {n,} is a greedy quantifier whose lazy equivalent is {n}?.

This is wrong. The lazy equivalent of {n,} is {n,}? which is not the same as {n}?.

UPDATE: Newer version of the article have corrected this error.

Aborigine answered 1/8, 2013 at 23:32 Comment(5)
So it seems the version with the question mark ({n}?) is only implemented (and documented) for completeness because its siblings {n,}? and {n,m}? also exist?Unaware
@MariusSchulz Yes, I'd say so. The article even provides an example using {n}? but that code behaves identically if you replace it with {n}.Aborigine
Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have {n}? problems.Unaware
"only implemented (and documented) for completeness" -- no, it's the other way around ... the spec and the implementation would have to go out of the way to exclude {...}? in the case where there's an exact count, and there's no reason to do that and plenty of reasons not to.Deliberative
"It is the lazy counterpart of the greedy quantifier {n}+" -- that was an error (in implementations that support it, {n}+ is possessive, not merely greedy) but the documentation has been changed and no longer says that.Deliberative
G
1

More a notice than an answer, but good to know, in particular if you project to use a same pattern in different languages or if you decide to use an other regex library with .net.

About:

I would have expected the pattern aa{1}?a to match both aaa and aa for example. While it matches aaa, aa is not a match.

a{n} and a{n}? produce the same result (there are seen as the greedy and non-greedy version, but of a fixed quantifier) with most of the regex engines.

But this is not the case with Oniguruma and Onigmo regex engines. With them a{n}? behaves like (?:a{n})?. Since wrappers for .net exist for these libraries, it is useful to clarify.

The same with ERE (Extended Regular Expressions) used in sed, grep and with dbms.

Gingersnap answered 1/8, 2013 at 23:5 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.