How can I chain together filename modifiers in a bash shell?

S

3

10

I understand the modifiers # ## % %%, but I can't figure out if it's possible to chain them together as you can in tcsh.

Example in tcsh

set f = /foo/bar/myfile.0076.jpg
echo $f:r:e
--> 0076

echo $f:h:t
--> bar

In bash, I'd like to know how to do something like:

echo ${f%.*#*.}

in one line.

My goal is to be able to manipulate filenames in various ways as and when needed on the command line. I'm not trying to write a script for one specific case. So if there is a way to chain these modifiers, or maybe there's another way, then I'd love to know. Thanks

Stotts answered 28/2, 2011 at 20:25 Comment(0)

S

1

I found a solution that gets pretty close to the simplicity of the tcsh filename modifiers. I wrote 4 functions and put them in .bashrc.

e() # the extension
E() # everything but the extension
t() # the tail - i.e. everything after the last /
T() # everything but the tail (head)

Definitions are at the end.

These functions can accept an argument like so:

f=foo/bar/my_image_file.0076.jpg
e $f
--> jpg
E $f
--> foo/bar/my_image_file.0076

or accept input from a pipe, which is the feature from tcsh that I really wanted:

echo $f|E|e
--> 0076

or of course, a combination:

T $f|t
--> bar

and it just dawned on me it will accept many files through the pipe:

ls foo/bar/
--> my_image_file.0075.jpg  my_image_file.0076.jpg
ls foo/bar/ |E|e
--> 0075
--> 0076

Definitions:

#If there are no args, then assume input comes from a pipe.

function e(){
    if [ $# -ne 0 ]; then
        echo "${1##*.}"
    else
        # IFS= and -r keep leading whitespace and backslashes intact
        while IFS= read -r data; do
            echo "${data##*.}"
        done
    fi
}

function E(){
    if [ $# -ne 0 ]; then
        echo "${1%.*}"
    else
        while IFS= read -r data; do
            echo "${data%.*}"
        done
    fi
}

function t(){
    if [ $# -ne 0 ]; then
        echo "${1##*/}"
    else
        while IFS= read -r data; do
            echo "${data##*/}"
        done
    fi
}

function T(){
    if [ $# -ne 0 ]; then
        echo "${1%/*}"
    else
        while IFS= read -r data; do
            echo "${data%/*}"
        done
    fi
}
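
The four definitions share the same "argument or pipe" logic, so it could be factored out. A minimal sketch, assuming longer hypothetical names (ext, noext, tail_of, head_of, and the helper _each) in place of e, E, t and T, and written without the function keyword:

```shell
# Hypothetical single-path helpers; each strips one component.
_ext()   { printf '%s\n' "${1##*.}"; }   # the extension
_noext() { printf '%s\n' "${1%.*}"; }    # everything but the extension
_tail()  { printf '%s\n' "${1##*/}"; }   # everything after the last /
_head()  { printf '%s\n' "${1%/*}"; }    # everything before the last /

# Apply helper $1 to each remaining argument,
# or to each line of standard input when no arguments are given.
_each() {
    _fn=$1; shift
    if [ "$#" -gt 0 ]; then
        for _p in "$@"; do "$_fn" "$_p"; done
    else
        while IFS= read -r _p; do "$_fn" "$_p"; done
    fi
}

ext()     { _each _ext "$@"; }
noext()   { _each _noext "$@"; }
tail_of() { _each _tail "$@"; }
head_of() { _each _head "$@"; }

printf '%s\n' foo/bar/my_image_file.0076.jpg | noext | ext
head_of foo/bar/my_image_file.0076.jpg | tail_of
```

The last two lines should print 0076 and bar, matching the examples above.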

Stotts answered 1/3, 2011 at 22:23 Comment(3)

+1 - Note that your t() is similar to basename and your T() is similar to dirname. I'll post my version of those as edits to my answer. They handle some edge cases in the way that the actual utilities do. – Safety 2/3, 2011 at 0:10

I think your robust handling of edge cases combined with the ability to read input from a pipe (my answer), would make the ideal filename manipulation tools in bash. Way better than what's available by default. – Stotts 3/3, 2011 at 13:58

FYI, using the function keyword makes your code less portable than it would be without it -- just T() { with no function preceding is (the opening of) a POSIX-compliant function declaration, and will work even with ash or dash. – Snake 5/7, 2017 at 14:32

V

7

In bash, you can nest Parameter Expansions but only effectively in the word part in ${parameter#word}.

For example:

$ var="foo.bar.baz"; echo ${var%.*}
foo.bar
$ var="foo.bar.baz"; echo ${var#foo.bar}
.baz
$ var="foo.bar.baz"; echo ${var#${var%.*}}
.baz
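
One caveat with the nested form: the result of the inner expansion is read as a pattern, so a name containing glob characters may not be stripped as expected. Quoting the inner expansion makes it literal. A small sketch, using a made-up file name:

```shell
var='a[1].txt'                 # hypothetical name containing glob characters
unquoted=${var#${var%.*}}      # inner result a[1] acts as a pattern: it matches "a1", not "a[1]"
quoted=${var#"${var%.*}"}      # inner result is taken literally
echo "$unquoted"               # a[1].txt (nothing was removed)
echo "$quoted"                 # .txt
```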

To do what you want with pure parameter expansion, you need to have a temp var like so:

$ var="/foo/bar/myfile.0076.jpg"; tmp=${var#*.}; out=${tmp%.*}; echo $out
0076

However, if you're willing to use the set builtin then you could actually get access to all the fields in one go with some clever use of the search/replace Parameter Expansion like so:

$ var="/foo/bar/myfile.0076.jpg"; set -- ${var//[.\/]/ }; echo $4
0076
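
Note that the unquoted expansion in the set line is subject to whitespace splitting and globbing, so a path containing spaces or * could produce unexpected fields. A slightly more defensive sketch splits on . and / through IFS instead; it assumes IFS was set and globbing was enabled beforehand, and the path is made up. Because the leading / yields an empty first field, the number lands in $5 here rather than $4:

```shell
var="/foo/bar/my file.0076.jpg"   # hypothetical path with a space in it
old_ifs=$IFS
set -f                 # turn off globbing while the unquoted expansion is split
IFS='./'               # split on dots and slashes only
set -- $var            # fields: "", foo, bar, "my file", 0076, jpg
IFS=$old_ifs           # restore the previous separator list
set +f                 # turn globbing back on
dir=$3
num=$5
echo "$dir $num"       # bar 0076
```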

Vasily answered 28/2, 2011 at 20:32 Comment(5)

I see. So, as the word part is always deleted from the beginning or end of the parameter, modifying the word part can only modify how much is deleted. It can never strip a bit from the beginning and a bit from the end in one line. – Stotts 28/2, 2011 at 20:47

Right, you are essentially locked in to the left-most parameter expansion (# in my example). If bash allowed you to get the length of a parameter expanded variable in one go, like ${#var#foo.bar} (doesn't work) then you could potentially do what you want via the offset parameter expansion ${parameter:offset:length} – Vasily 28/2, 2011 at 20:55

Yes - I see there are many modifiers that make it seem possible with a bit of effort, but it just doesn't quite get there. Maybe I'll try to brush up on sed and use pipes. Thanks for your help. – Stotts 28/2, 2011 at 21:19

@Julian set or even read would be faster than calling sed because that is an external binary and would cause bash to fork off a new process. – Vasily 28/2, 2011 at 23:0

Thanks @SiegeX. I found a solution that gets some tcsh-style chained modifiers working, and since seeing this I edited it to avoid sed. I'll post it as an answer. – Stotts 1/3, 2011 at 21:48

S

3

One way you can do what you're trying to achieve is to use Bash's regexes (version 3.2 and later).

f=/foo/bar/myfile.0076.jpg
pattern='/([^/]*)/[^/]*\.([0-9]*)\.'
[[ $f =~ $pattern ]]
echo ${BASH_REMATCH[1]}    # bar
echo ${BASH_REMATCH[2]}    # 0076
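
When wrapping this in a function, it can be worth checking the exit status of the match so that a non-matching name is not silently treated as a result. A minimal Bash-only sketch, where frame_of and its pattern are hypothetical and assume the number sits between the last two dots of the file name:

```shell
# Hypothetical helper: print the number before the final extension, or fail.
frame_of() {
    local re='\.([0-9]+)\.[^./]*$'    # ".digits." followed by a final extension
    [[ $1 =~ $re ]] || return 1       # no match: report failure, print nothing
    printf '%s\n' "${BASH_REMATCH[1]}"
}

frame_of /foo/bar/myfile.0076.jpg     # 0076
frame_of /foo/bar/readme.txt || echo "no frame number"
```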

You can apply the parameter expansion operators in sequence:

f=/foo/bar/myfile.0076.jpg
r=${f%.*}  # remove the extension
r=${r#*.}  # remove the part before the first (now only) dot
echo $r    # 0076
r=${f%/*}  # similar, but use slash instead of dot
r=${r##*/}
echo $r    # bar
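
The sequential approach can be wrapped up to get something close to the tcsh chaining from the question. A rough sketch, assuming a hypothetical function mod that takes a path followed by tcsh-style modifier letters (r, e, h, t) and applies them left to right. It is not a faithful emulation: for a name without a dot, for instance, e returns the whole name rather than nothing.

```shell
# Hypothetical helper: mod PATH MODIFIER...
mod() {
    _v=$1; shift
    for _m in "$@"; do
        case $_m in
            r) _v=${_v%.*} ;;     # root: drop the last extension
            e) _v=${_v##*.} ;;    # extension: keep what follows the last dot
            h) _v=${_v%/*} ;;     # head: drop the last path component
            t) _v=${_v##*/} ;;    # tail: keep the last path component
            *) printf 'mod: unknown modifier: %s\n' "$_m" >&2; return 1 ;;
        esac
    done
    printf '%s\n' "$_v"
}

f=/foo/bar/myfile.0076.jpg
mod "$f" r e    # 0076
mod "$f" h t    # bar
```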

Another way is to combine parameter expansion with extended globs:

shopt -s extglob
f=/foo/bar/myfile.0076.jpg
r=${f/%*([^0-9])}    # remove the non-digits from the end
r=${r/#*([^0-9])}    # remove the non-digits from the beginning
echo $r              # 0076
r=${f/#*(\/*([^\/])\/)}    # remove the first two slashes and what's between them
r=${r/%\/*(?)}             # remove the last slash and everything after it
echo $r                    # bar

Edit:

Here are my Bash functions that do basename and dirname. They handle edge cases in a way similar to those utilities.

bn ()
{
    [[ $1 == / ]] && echo / || echo "${1##*/}"
}

dn ()
{
    [[ -z ${1%/*} ]] && echo / || {
        [[ $1 == .. ]] && echo . || echo "${1%/*}"
    }
}
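
As written, dn echoes a bare name such as foo unchanged, whereas dirname prints . for it. A possible variant that covers that case, sketched with a case statement under the hypothetical name dn_alt (it still does not normalize trailing slashes the way dirname does):

```shell
# Hypothetical variant of dn; trailing slashes are not normalized.
dn_alt() {
    case $1 in
        */*) set -- "${1%/*}"               # drop the last component
             printf '%s\n' "${1:-/}" ;;     # an empty result means the root
        *)   printf '%s\n' . ;;             # no slash at all: current directory
    esac
}

dn_alt /foo/bar    # /foo
dn_alt /foo        # /
dn_alt foo         # .
```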

Safety answered 1/3, 2011 at 1:28 Comment(1)

Thanks for these. It seems bash is very powerful compared to tcsh, but for modifying filenames it's not the kind of syntax I could use in an ad hoc way on the command line every day. Fortunately, you can wrap up useful regexes, such as those you suggested, in functions and put them in .bashrc. – Stotts 1/3, 2011 at 22:44
