Does lookbehind work in sed?

I created a test using grep but it does not work in sed.

grep -P '(?<=foo)bar' file.txt

This works correctly by returning bar.

sed 's/(?<=foo)bar/test/g' file.txt

I was expecting footest as output, but it did not work.

Imprint answered 29/9, 2014 at 23:2 Comment(2)
sed does not support lookaround assertions. - Segregate
For what it's worth, grep -P is also a nonstandard extension, though typically available on Linux (but not other platforms). - Metagenesis
S
49

GNU sed does not have support for lookaround assertions. You could use a more powerful language such as Perl or possibly experiment with ssed which supports Perl-style regular expressions.

perl -pe 's/(?<=foo)bar/test/g' file.txt
Segregate answered 29/9, 2014 at 23:6 Comment(1)
The text accompanying your solution doesn't quite make sense since Perl doesn't support PCRE either (at least not natively). - Askance
G
51

Note that most of the time you can avoid a lookbehind (or a lookahead) using a capture group and a backreference in the replacement string:

sed 's/\(foo\)bar/\1test/g' file.txt
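
A quick way to see the capture-group version in action, as a sketch on a made-up sample line (the input text and variable names are assumptions): foo is matched, captured, and written back via \1, so only the bar that follows it changes.

```shell
# Hypothetical sample input.
line='hello bar foobar'

# \(foo\) captures the prefix; \1 puts it back in front of the replacement.
out=$(printf '%s\n' "$line" | sed 's/\(foo\)bar/\1test/g')
printf '%s\n' "$out"   # hello bar footest
```

The standalone bar is left alone because the pattern only matches when foo comes directly before it.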

Simulating a negative lookbehind is more subtle and needs several substitutions to protect the substring you want to leave untouched. Example for (?<!foo)bar:

sed 's/#/##/g;s/foobar/foob#ar/g;s/bar/test/g;s/foob#ar/foobar/g;s/##/#/g' file.txt
  • choose an escape character and repeat it (for example # => ##).
  • include this character in the substring you want to protect (foobar here, => foob#ar or ba => b#a).
  • make your replacement.
  • replace foob#ar with foobar (or b#a with ba).
  • replace ## with #.
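
The steps above can be sketched one substitution per -e, which makes it easier to follow what each stage does. The sample line is the one used in the comments on this answer; the variable names are assumptions for illustration.

```shell
# Sample input that already contains the escape character and a tricky "foob#ar".
line='abc foobar # foob#ar foo bar'

# 1. double every existing # so a single # can only come from step 2
# 2. protect foobar by splitting its "bar" with a single #
# 3. do the real replacement on every remaining bar
# 4. restore the protected foobar
# 5. undo the doubling from step 1
out=$(printf '%s\n' "$line" | sed \
  -e 's/#/##/g' \
  -e 's/foobar/foob#ar/g' \
  -e 's/bar/test/g' \
  -e 's/foob#ar/foobar/g' \
  -e 's/##/#/g')
printf '%s\n' "$out"   # abc foobar # foob#ar foo test
```

Note that the literal foob#ar already present in the input survives unchanged, which is the reason for doubling # first.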

You can also describe everything that isn't foo before bar in a capture group:

sed -E 's/(^.{0,2}|[^f]..|[^o].?)bar/\1test/g' file.txt

But it will quickly become tedious with more characters.
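
As a rough check of the alternation approach, here is a sketch on a made-up line (input text and variable names are assumptions). The group matches up to three characters of context that cannot spell foo, and \1 writes that context back.

```shell
# Hypothetical sample input: one bar after foo, two that are not.
line='foobar bar xbar'

# The group accepts: fewer than 3 chars since line start, or 3 chars not
# starting with f, or a non-o in one of the last two positions.
out=$(printf '%s\n' "$line" | sed -E 's/(^.{0,2}|[^f]..|[^o].?)bar/\1test/g')
printf '%s\n' "$out"   # foobar test xtest
```

One thing to watch for: the context characters are consumed by the match, so occurrences that sit back to back (barbar, for instance) may not all be replaced in a single pass.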

Ganglion answered 29/9, 2014 at 23:27 Comment(5)
But this does not work for a "negative" lookbehind, e.g. when you want "bar" NOT preceded by "foo" to be replaced with "test", which would be done (if it worked) with /(?<!foo)bar/test/. Does anyone have a solution to this? (I want to use uniq on the 5th field of an SQL file, but preceding fields may contain spaces, so I have no better idea than to replace all spaces NOT between ...', '... by "_"...) - Ionian
@Max: 1) choose an escape character and repeat it (for example # => ##). 2) include this character in the substring you want to protect (foobar here, => foob#ar). 3) make your replacement. 4) replace foob#ar with foobar. 5) replace ## with #. Example with sed: sed 's/#/##/g;s/foobar/foob#ar/g;s/bar/test/g;s/foob#ar/foobar/g;s/##/#/g' <<<'abc foobar # foob#ar foo bar' - Ganglion
OK, yes, that works. Essentially you remove what you want to protect (maybe foobar => -#- instead of foob#ar would be clearer), then you find and replace all the others, then you put the "protected" ones back. - Ionian
@Max: foobar => -#-: if you want (and only if you have replaced all # with ## before). - Ganglion
@MichaelChirico: Thanks Chirico (and the sorceress) for your edit. - Ganglion
D
1

sed doesn't support lookarounds but choose (I'm the author) does. It uses PCRE2 syntax.

For example:

$ echo "hello bar foobar" | choose -r --sed '(?<=foo)bar' --replace test
hello bar footest

Its speed is comparable to sed's.

Deliberate answered 1/9, 2023 at 20:19 Comment(0)
