How to replace spaces and slash in string in bash?

Asked 28/2, 2014 at 23:22 Answered 26/5, 2014 at 14:11

Solved regex bash optimization scripting tr

Given the string:

foo='Hello     \    
World! \  
x

we are friends

here we are'

Suppose there are also tab characters mixed with the spaces before or after the \ character. I want to replace the spaces, tabs and the backslash with a single space. I tried:

echo "$foo" | tr "[\s\t]\\\[\s\t]\n\[\s\t]" " " | tr -s " "

Returns:

Hello World! x we are friend here we are

And the result I need is:

Hello World! x

we are friends

here we are

Any ideas, tips or tricks for doing this? Could I get the result I want with a single command?

Atmospheric answered 28/2, 2014 at 23:22 Comment(1)

Note: tr deals in lists of characters, but you appear to be trying to pass it a regular expression (\s*\\\s*\n\s*). This may appear to work at first, but isn't actually doing what you expect; in this case, (partly because of quirks in backslash parsing in double quotes), it'll replace "\", "s", "*", and newline characters with spaces. – Menefee 1/3, 2014 at 7:23
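
To illustrate the point about character lists, here is a minimal sketch (the sample strings are made up for illustration). It shows that tr maps characters one-for-one and treats a regex metacharacter such as * as an ordinary character:

```shell
# tr translates single characters; 'a*' is the two-character list {a, *}, not a regex.
mapped=$(printf 'a*b\n' | tr 'a*' 'XY')
printf '%s\n' "$mapped"

# Character classes are the supported way to name groups of characters.
# -s squeezes each run of the replacement character into a single one.
squeezed=$(printf 'a \t  b' | tr -s '[:blank:]' ' ')
printf '%s\n' "$squeezed"
```

The first command turns a into X and * into Y; the second turns any mix of spaces and tabs into one space.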

The following one-liner gives the desired result:

echo "$foo" | tr '\n' '\r' | sed 's,\s*\\\s*, ,g' | tr '\r' '\n'
Hello World!

we are friends

here we are

Explanation:

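tr '\n' '\r' replaces each newline with a carriage return, so sed sees the whole input as one line and the pattern can match across the original line breaks.

sed 's,\s*\\\s*, ,g' replaces each backslash, together with any whitespace around it (including the carriage return that follows), with a single space.

tr '\r' '\n' turns the remaining carriage returns back into newlines.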
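
For seds without the GNU \s shorthand, the same idea can probably be written with the POSIX class [[:space:]]. This is only a sketch: the sample input (with a tab after the second backslash) is made up, and it assumes the text contains no carriage returns of its own, since those would come back as newlines.

```shell
# Sample input with spaces and a tab around the backslashes (made up for this sketch).
foo=$(printf 'Hello     \\    \nWorld! \\ \t \nx\n\nwe are friends\n\nhere we are')

# [[:space:]] also matches the temporary carriage returns, which is what
# lets one match swallow the line break that follows a backslash.
result=$(printf '%s\n' "$foo" \
    | tr '\n' '\r' \
    | sed 's,[[:space:]]*\\[[:space:]]*, ,g' \
    | tr '\r' '\n')
printf '%s\n' "$result"
```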

Helicoid answered 6/3, 2014 at 10:15 Comment(0)

Try as below:

#!/bin/bash

foo="Hello     \
World!"

echo $foo | sed 's/[\s*,\\]//g'

Euphrates answered 28/2, 2014 at 23:49 Comment(4)

The whitespace coalescing due to the unquoted $foo after the echo is working some of its magic here. Not sure if this was intended as part of the solution – Coenzyme 28/2, 2014 at 23:59

Thank You Sabuj, It's my first answer :) – Euphrates 28/2, 2014 at 23:59

A bracket expression denotes a set of characters, and most regex-special chars have no special meaning in brackets. The expression [\s*,\\] will match an s (in some sed's), a star, a comma, or a backslash. You probably want sed 's/[ \\]\+/ /g' – Gooch 1/3, 2014 at 0:5
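
A small sketch of the bracket-expression behaviour described in this comment (the sample strings are made up). It uses the POSIX interval \{1,\} instead of \+, on the assumption that \+ is not available in every sed:

```shell
# Inside [...] the star, the comma and the backslash are all literal,
# so each of them is replaced with an underscore.
literal=$(printf 'a*b,c\\d\n' | sed 's/[*,\\]/_/g')
printf '%s\n' "$literal"

# Collapse each run of spaces and backslashes into one space.
collapsed=$(printf 'Hello   \\  World!\n' | sed 's/[ \\]\{1,\}/ /g')
printf '%s\n' "$collapsed"
```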

It prints two spaces between "Hello" and "World!". – Atmospheric 1/3, 2014 at 0:18

If you just want to print the output as given, you just need to:

foo='Hello     \
World!'
bar=$(tr -d '\\' <<<"$foo")
echo $bar    # unquoted!

Hello World!

If you want to squeeze the whitespace as it's being stored in the variable, then one of:

bar=$(tr -d '\\' <<<"$foo" | tr -s '[:space:]' " ")
bar=$(perl -0777 -pe 's/\\$//mg; s/\s+/ /g' <<<"$foo")

The advantage of the perl version is that it only removes line continuation backslashes (at the end of the line).

Note that when you use double quotes, the shell takes care of line continuations (proper ones, with no whitespace after the backslash):

$ foo="Hello    \
World"
$ echo "$foo"
Hello    World

So at this point, it's too late.

If you use single quotes, the shell won't interpret line continuations, and you can process the backslashes yourself:

$ foo='Hello     \
World!

here we are'
$ echo "$foo"
Hello     \
World!

here we are
$ echo "$foo" | perl -0777 -pe 's/(\s*\\\s*\n\s*)/ /sg'
Hello World!

here we are

Gooch answered 1/3, 2014 at 0:1 Comment(1)

Warning: using unquoted variable expansions to coalesce whitespace isn't safe, because it may also expand wildcards. – Menefee 1/3, 2014 at 7:24
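
To make the warning concrete, here is a sketch of the hazard and one quoted alternative. The scratch directory and the file names file1 and file2 are made up for the demonstration:

```shell
text='a   *   b'

# Demonstrate the hazard in a scratch directory holding two known files.
tmpdir=$(mktemp -d)
touch "$tmpdir/file1" "$tmpdir/file2"
unsafe=$(cd "$tmpdir" && echo $text)    # unquoted on purpose: * becomes file names
rm -rf "$tmpdir"
printf '%s\n' "$unsafe"

# Safer: keep the expansion quoted and squeeze the whitespace explicitly.
safe=$(printf '%s' "$text" | tr -s '[:space:]' ' ')
printf '%s\n' "$safe"
```

The unquoted version prints the file names in place of the star, while the quoted version keeps the star as text.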

foo='Hello     \    
World! \  
x

we are friends

here we are'

If you use double quotes then the shell will interpret the \ as a line continuation character. Switching to single quotes preserves the literal backslash.

I've added a backslash after World! to test multiple backslash lines in a row.

sed -r ':s; s/( )? *\\ *$/\1/; Te; N; bs; :e; s/\n *//g' <<< "$foo"

Output:

Hello World! x

we are friends

here we are

What's this doing? In pseudo-code you might read this as:

while (s/( )? *\\ *$/\1/) {  # While there's a backslash to remove, remove it...
    N                        # ...and concatenate the next line.
}

s/\n *//g                    # Remove all the newlines.

In detail, here's what it does:

1. :s is a branch labeled s for "start".
2. s/( )? *\\ *$/\1/ replaces a backslash and its surrounding whitespace. It leaves one space if there was one by capturing ( )?.
3. If the previous substitution failed, Te jumps to label e.
4. N concatenates the following line, including the newline \n.
5. bs jumps back to the start. This is so we can handle multiple consecutive lines with backslashes.
6. :e is a branch labeled e for "end".
7. s/\n *//g removes all the extra newlines from step #4. It also removes leading spaces from the following line.

Note that T is a GNU extension. If you need this to work in another version of sed, you'll need to use t instead. That'll probably take an extra b label or two.
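
One possible way to avoid T (and t) altogether is to guard the loop with an address instead of testing the substitution. The sketch below is not the command above rewritten one-to-one: it uses [[:blank:]] so that tabs are handled as well as spaces, and it always leaves exactly one space where the backslash was. The sample input is made up, and the behaviour when the very last line ends in a backslash differs between sed implementations.

```shell
foo=$(printf 'Hello     \\    \nWorld! \\ \t \nx\n\nwe are friends\n\nhere we are')

# While the pattern space ends in a backslash (plus optional blanks):
#   N  appends the next line,
#   s  replaces backslash + surrounding blanks + newline + leading blanks with one space,
#   ba loops back to re-check the joined line.
result=$(printf '%s\n' "$foo" | sed \
    -e ':a' \
    -e '/\\[[:blank:]]*$/{' \
    -e 'N' \
    -e 's/[[:blank:]]*\\[[:blank:]]*\n[[:blank:]]*/ /' \
    -e 'ba' \
    -e '}')
printf '%s\n' "$result"
```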

Olivette answered 5/3, 2014 at 16:35 Comment(4)

Thanks for the explanation, but could you tell me why this doesn't work? pastebin.com/g2iPnyvG Thanks! :) – Atmospheric 5/3, 2014 at 21:46

@JohnDoe Updated my answer to also remove leading spaces from the line following a backslash. I changed the last substitution from s/\n//g to s/\n *//g. – Olivette 5/3, 2014 at 22:23

I replaced all the ' *' with '[ \t]' to include the tabs too, but it does not work. To test the new case, just add tabs and spaces randomly in the lines after the '\' character. Thanks for everything, John. :) – Atmospheric 6/3, 2014 at 7:59

Could you help me with the last question posted in the previous comment, please? – Atmospheric 10/3, 2014 at 7:24

You could use a read loop to get the desired output.

arr=()
i=0
while read line; do
    ((i++))
    [ $i -le 3 ] && arr+=($line)
    if [ $i -eq 3 ]; then
        echo ${arr[@]}
    elif [ $i -gt 3 ]; then
        echo $line
    fi
done <<< "$foo"
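
The loop above hardcodes the first three lines. A more general sketch follows, assuming bash and assuming that a continuation is any line ending in a backslash optionally followed by blanks. The function name join_continuations and the sample input are made up for illustration:

```shell
join_continuations() {
    # A line continues if it ends in a backslash, optionally followed by blanks.
    local re='^(.*[^[:blank:]])?[[:blank:]]*\\[[:blank:]]*$'
    local line pending='' joining=0
    while IFS= read -r line || [[ -n $line ]]; do
        if (( joining )); then
            # Drop leading blanks of a line that continues the previous one.
            line=${line#"${line%%[![:blank:]]*}"}
        fi
        if [[ $line =~ $re ]]; then
            # Keep the text before the backslash and add one separating space.
            pending+="${BASH_REMATCH[1]} "
            joining=1
        else
            printf '%s\n' "$pending$line"
            pending=''
            joining=0
        fi
    done
    # Flush whatever is left if the input ended on a continuation line.
    if (( joining )); then printf '%s\n' "$pending"; fi
}

foo=$(printf 'Hello     \\    \nWorld! \\ \t \nx\n\nwe are friends\n\nhere we are')
result=$(join_continuations <<< "$foo")
printf '%s\n' "$result"
```

Reading with IFS= and -r keeps the backslashes and the surrounding whitespace intact, so the regex decides what counts as a continuation rather than read itself.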

Necrophilism answered 10/3, 2014 at 3:41 Comment(0)

With awk:

$ echo "$foo"
Hello     \
World! \
x

we are friends

here we are

With trailing newline:

$ echo "$foo" | awk '{gsub(/[[:space:]]*\\[[:space:]]*/," ",$0)}1' RS= FS='\n' ORS='\n\n'
Hello World! x

we are friends

here we are
                                                                                              .

Without trailing newline:

$ echo "$foo" | 
awk '{
  gsub(/[[:space:]]*\\[[:space:]]*/," ",$0)
  a[++i] = $0
}
END {
  for(;j<i;) printf "%s%s", a[++j], (ORS = (j < NR) ? "\n\n" : "\n")
}' RS= FS='\n' 
Hello World! x

we are friends

here we are

Ulund answered 10/3, 2014 at 5:53 Comment(0)

sed is an excellent tool for simple substitutions on a single line but for anything else just use awk. This uses GNU awk for multi-char RS (with other awks RS='\0' would work for text files that don't contain NUL chars):

$ echo "$foo" | awk -v RS='^$' -v ORS= '{gsub(/\s+\\\s+/," ")}1'
Hello World! x

we are friends

here we are

Anitraaniweta answered 26/5, 2014 at 14:11 Comment(0)

With bashisms such as extended globbing, parameter expansion etc...but it's probably just as ugly

foo='Hello     \    
World!'
shopt -s extglob
echo "${foo/+( )\\*( )$'\n'/ }"
Hello World!
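
The expansion above replaces only the first match and only handles spaces. A sketch of a variant that replaces every match (// instead of /) and uses [[:blank:]] to cover tabs, assuming bash with extglob enabled and a made-up sample input:

```shell
shopt -s extglob    # enables the *(...) pattern operator

foo=$(printf 'Hello     \\    \nWorld! \\ \t \nx\n\nwe are friends')

# Pattern: optional blanks, a backslash, optional blanks, a newline,
# then optional leading blanks of the next line; each match becomes one space.
result=${foo//*([[:blank:]])\\*([[:blank:]])$'\n'*([[:blank:]])/ }
printf '%s\n' "$result"
```

Newlines that are not preceded by a backslash are left alone, so the blank lines between paragraphs survive.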

Coenzyme answered 28/2, 2014 at 23:54 Comment(1)

I think the shorter and (slightly) less ugly echo "${foo/+($'\n'| |\\)/ }" is equivalent. – Muslim 1/3, 2014 at 22:58

As I understand, you want to just remove trailing spaces followed by a backslash-escaped newline?

In that case, search with the regex ( ) *\\\n and replace with \1

Seadog answered 5/3, 2014 at 14:53 Comment(2)

Using sed, tr... ? Could you explain the code or give me a little example, please? Thanks! – Atmospheric 5/3, 2014 at 15:34

@JohnDoe: The regex basically finds a single space, followed by any number of spaces (zero-or-many), followed by a backslash and newline. You then replace it with a single space, consuming the other spaces, backslash and newline. I don't have a bash environment handy at the moment, but I'll try to give a complete call later. – Seadog 5/3, 2014 at 15:43
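
One way this regex might be applied with sed is sketched below. It is not the answerer's own command: the ':a;N;$!ba' idiom is swapped in to read the whole input into the pattern space so that \n can be matched, and the sample input is made up. As the regex implies, it only joins lines where the backslash is the last character on the line, and on a one-line input the substitution is never reached.

```shell
foo=$(printf 'Hello     \\\nWorld!')

# :a / N / $!ba  append lines until the last one has been read;
# then one space, further spaces, a backslash and a newline become that one space.
result=$(printf '%s\n' "$foo" \
    | sed -e ':a' -e 'N' -e '$!ba' -e 's/\( \) *\\\n/\1/g')
printf '%s\n' "$result"
```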
