Explanation of difference between GNU sed and BSD sed

I wrote the following command

echo -en 'uno\ndue\n' | sed -E 's/^.*(uno|$)/\1/'

expecting the following output

uno

This is indeed the case with my GNU Sed 4.8.

However, I've verified that BSD sed outputs only empty lines:



Why is that the case?

Sherylsheryle answered 11/4, 2022 at 21:21 Comment(6)
I'm not sure I would have the same expectations. Regexes are greedy. Because of that, the .* should always match the entire line, so that inside the parens matches the end of line. - Indistinguishable
This answer goes in depth about the differences between various sed implementations. - Ecology
Just a guess here: it looks like the GNU ERE regex engine is willing to backtrack farther to find the longer match ("uno"), whereas the BSD regex engine is happy enough to let .* consume the whole line, and then capture ($) the empty string. - Alkalify
@TimRoberts, I'm pretty sure Mastering Regular Expressions gives examples of engines where alternation is neither greedy nor lazy, but ordered. - Sherylsheryle
perl gives empty lines too. I think this depends on the implementation, and as linked above, there are plenty of differences between GNU and BSD. - Eolith
@TimRoberts quantifiers in BRE/ERE are not exactly greedy though; the longest match wins. For example, echo 'foo123312baz' | grep -oE 'o[123]+(12baz)?' gives o123312baz, whereas you'll get o123312 with greedy quantifiers like those in PCRE. - Eolith
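
The longest-match behaviour described in the last comment can be checked with a short sketch. It assumes a grep that supports -o and -E (GNU and BSD grep both do) and uses perl, if installed, as a stand-in for a greedy backtracking engine. The variable names are illustrative only.

```shell
#!/bin/sh
input='foo123312baz'

# POSIX ERE: the overall match is the longest of the leftmost matches,
# so [123]+ gives back "12" to let the optional group (12baz)? take part.
ere_match=$(printf '%s\n' "$input" | grep -oE 'o[123]+(12baz)?')
printf 'ERE:  %s\n' "$ere_match"

# Perl-style backtracking: [123]+ greedily consumes "123312", then (12baz)?
# matches the empty string and the engine accepts that first success.
perl_match=''
if command -v perl >/dev/null 2>&1; then
  perl_match=$(printf '%s\n' "$input" | perl -ne 'print "$&\n" if /o[123]+(12baz)?/')
  printf 'Perl: %s\n' "$perl_match"
fi
```

The same regex therefore yields two different matches depending on whether the engine ranks candidates by overall length or simply accepts the first one found by backtracking.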

I'd say that BSD's sed is POSIX-compatible only. POSIX specifies support only for basic regular expressions, which have many limitations (e.g., no support for | (alternation) at all, no direct support for + and ?) and different escaping requirements.

BSD sed is the default on macOS, so the very first thing I do on a new system is install GNU sed: brew install gnu-sed (the binary is installed as gsed).

Gravelly answered 19/4, 2022 at 12:44 Comment(1)
BSD sed is not POSIX-compatible only. BSD sed does support extended regular expressions.Underclassman
