sed (in bash) works with [ \t] but not with \s?
I want to search-replace something containing whitespace on a bash command line, and I assumed sed would be the easiest way to go.

Using [ \t], which denotes either a space or a tab, to match the whitespace works as intended:

echo "abc xyz" | sed "s/[ \t]xyz/123/"
abc123

But using \s instead of [ \t] does not, to my surprise:

echo "abc xyz" | sed "s/\sxyz/123/"
abc xyz

I'm fairly new to bash so I might be missing something trivial, but no matter what I do, I can't get this to work. Using \\s instead of \s, using single quotes (' instead of "), putting the whitespace marker inside square brackets (like [\s] or even [\\s]): none of it seems to help.

(edit) in case it differs from one sed / bash version to another: I'm working on OS X here.

Additionally, I noticed that when I add a + after the [ \t] whitespace part, to match one or more space/tab characters, it stops working as well:

echo "abc xyz" | sed "s/[ \t]+xyz/123/"
abc xyz

(Again, I also tried \+ instead of +, and single quotes instead of double quotes; nothing helps.)

Myxomycete answered 4/5, 2015 at 11:19 Comment(4)
Interesting and related: How to match whitespace in sed? Basically: "For POSIX compliance, use the character class [[:space:]] instead of \s, since the latter is a GNU sed extension." So you are probably working with a non-GNU sed. Could you provide the output of sed --version? - Pompeii
With the sed here, it works as expected; as @Pompeii suggests, it is indeed GNU sed. - Excursion
Have you thoroughly read the sed man page on your system? It should document what does and does not work. Type man sed at a command prompt. - Trovillion
Ah, thanks, sed --version says "illegal option -- -", so I guess it's not GNU :) - Myxomycete
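The check discussed in these comments can be scripted. This is only a rough heuristic, assuming that a GNU sed mentions "GNU" in its --version output and that other implementations either reject the option or print something else; the variable name sed_flavor is made up for illustration.

```shell
# GNU sed answers --version and names itself in the output.
# BSD sed (as on OS X) rejects the option, so grep finds nothing.
if sed --version 2>/dev/null | grep -q 'GNU'; then
    sed_flavor="gnu"
else
    sed_flavor="non-gnu"
fi
echo "sed flavor: $sed_flavor"
```

If the result is "non-gnu", it is safer to stick to POSIX syntax and avoid GNU-only escapes such as \s.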
P
8

As seen in SuperUser's How to match whitespace in sed?:

For POSIX compliance, use the character class [[:space:]] instead of \s, since the latter is a GNU sed extension

So you are probably running a non-GNU sed, which is why \s does not work for you.

You have two options:

  • Use a bracket expression containing a space and \t, i.e. [ \t], as you were already doing.
  • Use the POSIX character class [[:space:]].
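A minimal sketch of the [[:space:]] option, applied to the input from the question plus a tab-separated variant. The variable names are only for illustration.

```shell
# [[:space:]] is a POSIX character class, so this should behave the same
# on GNU sed and on the BSD sed that ships with OS X.
with_space=$(echo "abc xyz" | sed 's/[[:space:]]xyz/123/')

# printf is used to produce a real tab character between abc and xyz.
with_tab=$(printf 'abc\txyz\n' | sed 's/[[:space:]]xyz/123/')

echo "$with_space"
echo "$with_tab"
```

Both commands should print abc123, since the class matches the space as well as the tab.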
Pompeii answered 4/5, 2015 at 11:25 Comment(4)
Confirmed, thanks, it works with [[:space:]]. However, the + modifier still doesn't work, any ideas? Note that [[:space:]]\{1,999\} does work, but [[:space:]]+ and [[:space:]]\+ do not. - Myxomycete
To use +, you normally need either to escape it as \+ or to use -r for extended regex: sed -r "s/[[:space:]]+/X/" - Pompeii
@Myxomycete The POSIX-compliant syntax for one or more matches is \{1,\}; anything else is an extension. - Caldarium
@TomFenech That is no longer true. The most recent spec defines + when using the sed -E (not -r) flag. It even specifies reluctant operators (+?, *?, ??). - Siana
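The two "one or more" spellings mentioned in these comments can be compared side by side. This is a sketch that assumes a sed accepting -E, which both GNU sed and BSD sed do; the variable names are illustrative.

```shell
input="abc   xyz"

# Basic regular expressions: \{1,\} is the POSIX spelling of "one or more".
bre=$(echo "$input" | sed 's/[[:space:]]\{1,\}xyz/123/')

# Extended regular expressions: with -E, a bare + means "one or more".
ere=$(echo "$input" | sed -E 's/[[:space:]]+xyz/123/')

echo "$bre"
echo "$ere"
```

Both variants should print abc123, collapsing the whole run of spaces before xyz.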
V
1
echo 'abc xyz<>abcxyz' | sed 's/[[:space:]]xyz/123/g'
abc123<>abcxyz
echo 'abc xyz<>abcxyz' | sed "s/[[:space:]]xyz/123/g"
abc123<>abcxyz

This doesn't work on very old sed versions, but it works on GNU sed as well as on POSIX-compliant seds (AIX, ...).

Vera answered 4/5, 2015 at 11:29 Comment(0)
