I thought that in regular expressions, the "greediness" applies to quantifiers rather than matches as a whole. However, I observe that
grep -E --color=auto 'a+(ab)?' <(printf "aab")
returns aab with the whole string highlighted, rather than only the leading aa.
The same applies to sed. On the other hand, in pcregrep and other tools, it is really the quantifier that is greedy. Is this a specific behaviour of grep?
N.B. I checked both grep (BSD grep) 2.5.1-FreeBSD and grep (GNU grep) 3.1
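To check the sed case without relying on colour, one option is to wrap the match in brackets. This is only a sketch: it assumes a sed that accepts -E (GNU sed or BSD sed), and the variable name marked is just for illustration.

```shell
# Wrap whatever the regex matches in [ ] so the matched span is visible as plain text.
marked=$(printf 'aab\n' | sed -E 's/a+(ab)?/[&]/')
printf '%s\n' "$marked"
```

If sed picks the longest overall match, as described above, this should print [aab]; an engine where only the quantifier is greedy would give [aa]b instead.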
echo 'aab' | grep -P 'a+(ab)?'
highlights aa, whereas
echo 'aab' | grep -E 'a+(ab)?'
highlights aab, meaning the optional ab matched even though it wasn't required. I think this is because the longest match wins. For example,
echo 'aab' | grep -E 'a+|a+b'
highlights aab because that's the longest match, whereas
echo 'aab' | grep -P 'a+|a+b'
highlights aa, because in PCRE alternation is tried left to right for matches starting at the same location. – Julesjuley
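The four cases from this comment can be compared side by side with grep -o, which prints only the matched text instead of highlighting it. This is a sketch under some assumptions: -P is only available in GNU grep built with PCRE support, and the variable names are made up for the example.

```shell
# POSIX ERE (grep -E): among matches starting at the leftmost position,
# the longest overall match is chosen.
ere_quant=$(printf 'aab\n' | grep -oE 'a+(ab)?')
ere_alt=$(printf 'aab\n' | grep -oE 'a+|a+b')

# PCRE (grep -P, assumes GNU grep with PCRE support): a backtracking engine,
# so greedy quantifiers and left-to-right alternation decide the result.
pcre_quant=$(printf 'aab\n' | grep -oP 'a+(ab)?')
pcre_alt=$(printf 'aab\n' | grep -oP 'a+|a+b')

printf 'ERE  a+(ab)? : %s\n' "$ere_quant"
printf 'ERE  a+|a+b  : %s\n' "$ere_alt"
printf 'PCRE a+(ab)? : %s\n' "$pcre_quant"
printf 'PCRE a+|a+b  : %s\n' "$pcre_alt"
```

Going by the behaviour described above, both ERE lines should show aab and both PCRE lines should show aa.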