How do I have bash "eat" indentation characters common to all lines in a string?

I have a multi-line string in a shell variable. Every line of the string is indented by an unknown number of whitespace characters, at least a few (8 spaces in my example, but the amount is arbitrary). Take this example string:

        I am at the root indentation level (8 spaces).
        I am at the root indentation level, too.
            I am one level deeper
            Am too
        I am at the root again
                I am even two levels deeper
                    three
                two
            one
        common
        common

What I want is a Bash function or command to strip away the common level of indentation (8 spaces here) so I get this:

I am at the root indentation level (8 spaces).
I am at the root indentation level, too.
    I am one level deeper
    Am too
I am at the root again
        I am even two levels deeper
            three
        two
    one
common
common

It can be assumed that the first line of this string is always at this common indentation level. What is the easiest way to do this? Ideally it should work, when reading the string line by line.

Lupien answered 20/11, 2015 at 22:20 Comment(1)
"Ideally it should work, when reading the string line by line." Of course you would need to read all lines before you are able to determine a "common indentation", unless the first line always has the exact common indentation. – Creon
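As the comment above points out, when the first line is not guaranteed to carry the common indentation, all lines have to be read before anything can be stripped. A minimal sketch of that case, assuming space-only indentation and ignoring blank lines when computing the minimum (the function name dedent_min is made up for illustration):

```shell
# dedent_min: hypothetical helper, not from the original thread.
# Buffers all of stdin, finds the smallest leading-space count among
# non-blank lines, then strips that many characters from every line.
dedent_min() {
  awk '
    { line[NR] = $0 }                 # buffer every line
    /[^ ]/ {                          # only lines with a non-space character
      match($0, /^ */)                # RLENGTH = number of leading spaces
      if (min == "" || RLENGTH < min) min = RLENGTH
    }
    END {
      for (i = 1; i <= NR; i++) print substr(line[i], min + 1)
    }
  '
}

# Here the second line, not the first, has the smallest indentation.
printf '    a\n  b\n      c\n' | dedent_min
```

The price is that the whole input is held in memory, so this does not work in a streaming, line-by-line fashion.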

You can use awk:

awk 'NR==1 && match($0, /^ +/){n=RLENGTH} {sub("^ {"n"}", "")} 1' file
I am at the root indentation level (8 spaces).
I am at the root indentation level, too.
    I am one level deeper
    Am too
I am at the root again
        I am even two levels deeper
            three
        two
    one
common
common

For the 1st record (NR==1) we match spaces at start (match($0, /^ +/)) and store the length of the match (RLENGTH) into a variable n.

Then, for every line, sub("^ {"n"}", "") strips exactly n leading spaces, and the trailing 1 prints the result.
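Since the question asks for a Bash function that can work line by line, here is a minimal sketch of the same first-line-prefix idea in plain shell instead of awk. The name dedent is made up for illustration, and the sketch assumes, as the question states, that the first line carries the common indentation. Lines that do not start with that exact prefix are printed unchanged.

```shell
# dedent: hypothetical helper, not part of the original answer.
# Reads text on stdin and removes the first line's leading spaces
# from every line that starts with them.
dedent() {
  local line prefix first
  first=1
  while IFS= read -r line || [ -n "$line" ]; do
    if [ "$first" = 1 ]; then
      prefix=${line%%[! ]*}   # keep only the leading run of spaces
      first=0
    fi
    printf '%s\n' "${line#"$prefix"}"   # quoted, so the prefix is taken literally
  done
}

text='        root
            deeper
        root again'
printf '%s\n' "$text" | dedent
```

Because each line is printed as soon as it is read, this also works on a stream, at the cost of being slower than awk on large inputs.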

Conqueror answered 20/11, 2015 at 22:26 Comment(3)
Slight modification to make sure only spaces are eaten: awk 'NR==1 && match($0, /^ +/){n=RLENGTH} { gsub("^ {"n","n"}","",$0); print $0}' – Lupien
Here's a slightly different approach. It's longer, but I like it because I think it's clearer about what it's actually doing: awk '(NR == 1){match($0, /^[\t ]*/, arr)} {print(substr($0, length(arr[0]) + 1))}'. It also supports mixed tab/space indents, as long as every line starts with the same indent characters as the first. – Schulte
That's a good one, though it works only in GNU awk because it uses the third argument of match(). – Conqueror
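For awks without the three-argument match(), the same substr idea can be sketched with RLENGTH, which the two-argument match() sets. The wrapper name dedent_chars is made up for illustration. Like the comment's version, it blindly drops the first n characters of each line, so it assumes every line is indented at least as far as the first.

```shell
# dedent_chars: hypothetical wrapper around a two-argument match().
# Measures the first line's leading tabs/spaces, then drops that many
# characters from the start of every line.
dedent_chars() {
  awk '
    NR == 1 { match($0, /^[\t ]*/); n = RLENGTH }   # n = indent width of line 1
    { print substr($0, n + 1) }
  '
}

printf '    first\n        nested\n    last\n' | dedent_chars
```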
