Bash arbitrary glob pattern (with spaces) in for loop

Is there any way to reliably use an arbitrary globbing pattern that's stored in a variable? I'm having difficulty if the pattern contains both spaces and metacharacters. Here's what I mean. If I have a pattern stored in a variable without spaces, things seem to work just fine:

<prompt> touch aa.{1,2,3} "a b".{1,2,3}
<prompt> p="aa.?"
<prompt> for f in ${p} ; do echo "|$f|" ; done
|aa.1|
|aa.2|
|aa.3|
<prompt> declare -a A=($p) ; for f in "${A[@]}" ; do echo "|$f|" ; done
|aa.1|
|aa.2|
|aa.3|

However, as soon as I throw a space in the pattern, things become untenable:

<prompt> p="a b.?"
<prompt> for f in ${p} ; do echo "|$f|" ; done
|a|
|b.?|
<prompt> declare -a A=($p) ; for f in "${A[@]}" ; do echo "|$f|" ; done
|a|
|b.?|
<prompt> for f in "${p}" ; do echo "|$f|" ; done
|a b.?|
<prompt> for f in $(printf "%q" "$p") ; do echo "|$f|" ; done
|a\|
|b.\?|

Obviously, if I know the pattern in advance, I can manually escape it:

<prompt> for f in a\ b.* ; do echo "|$f|" ; done
|a b.1|
|a b.2|
|a b.3|

The problem is, I'm writing a script where I don't know the pattern in advance. Is there any way to reliably make bash treat the contents of a variable as a globbing pattern, without resorting to some sort of eval trickery?

Gasper answered 24/10, 2014 at 18:34 Comment(3)
I do not believe you can get globbing without word-splitting. - Eskisehir
The root problem is that word-splitting occurs before pathname expansion (aka globbing), and there's no way to alter the order of expansions (much to the chagrin of people who want {1..$n} to work). You can, as in John1024's answer, disable word-splitting. - Eyra
If you don't know the pattern in advance then be very careful about using eval. You never know when Little Bobby Tables is going to turn up. :) - Thesda

You need to turn off word-splitting. To recap, this doesn't work:

$ p="a b.?"
$ for f in ${p} ; do echo "|$f|" ; done
|a|
|b.?|

This, however, does:

$ ( IFS=; for f in ${p} ; do echo "|$f|" ; done )
|a b.1|
|a b.2|
|a b.3|

IFS is the shell's "Internal Field Separator." By default it contains a space, a tab, and a newline. The shell uses it to split the results of unquoted expansions into words. Setting IFS to the empty string disables word splitting, so the whole value of p reaches pathname expansion as a single pattern and the glob works.
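To see the splitting step in isolation, here is a minimal sketch that turns globbing off with set -f and counts the words produced under each IFS setting. The array names default_words and nosplit_words are made up for this illustration.

```shell
#!/usr/bin/env bash
# Count the words produced by an unquoted expansion under two IFS settings.
set -f                  # turn globbing off so only word splitting is observed
p="a b.?"

IFS=$' \t\n'            # the usual default: space, tab, newline
default_words=( $p )    # split at the space

IFS=                    # empty IFS: no word splitting
nosplit_words=( $p )    # the whole value stays one word

IFS=$' \t\n'            # put the default back
set +f                  # turn globbing back on

printf 'default IFS: %d words\n' "${#default_words[@]}"
printf 'empty IFS:   %d word\n' "${#nosplit_words[@]}"
```

With the default IFS the pattern arrives at the globbing step as two separate words, which is why neither of them matches the files with a space in their names.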

Array example

The same applies to the array examples:

$ declare -a A=($p) ; for f in "${A[@]}" ; do echo "|$f|" ; done
|a|
|b.?|
$ ( IFS=; declare -a A=($p) ; for f in "${A[@]}" ; do echo "|$f|" ; done )
|a b.1|
|a b.2|
|a b.3|

Making sure that IFS gets returned to its normal value

In the examples above, I put the IFS assignment inside a subshell. That is not required, but it has the advantage that IFS automatically returns to its prior value as soon as the subshell terminates. If a subshell is not appropriate for your application, you can save and restore IFS by hand:

$ oldIFS=$IFS; IFS=; for f in ${p} ; do echo "|$f|" ; done; IFS=$oldIFS
|a b.1|
|a b.2|
|a b.3|
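Another possibility, sketched here rather than taken from the examples above, is to do the expansion inside a function that declares IFS local. The empty value then applies only while the function runs, and the loop body executes with the caller's IFS. The names expand_glob and GLOB_MATCHES are hypothetical.

```shell
#!/usr/bin/env bash
# expand_glob (hypothetical helper): expand the pattern in $1 into the
# global array GLOB_MATCHES without word splitting.
expand_glob() {
    local IFS=            # empty only inside this function; restored on return
    GLOB_MATCHES=( $1 )   # unquoted on purpose: globbing without splitting
}

# Work in a scratch directory so the example is self-contained.
workdir=$(mktemp -d) && cd "$workdir" || exit 1
touch aa.{1,2,3} "a b".{1,2,3}

p="a b.?"
expand_glob "$p"
for f in "${GLOB_MATCHES[@]}"; do echo "|$f|"; done
```

Because local also remembers whether IFS was unset, this avoids a corner case of the manual save and restore, which leaves IFS empty if it was unset beforehand.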

Matching patterns with shell-active characters

Suppose that we have files that have a literal * in their names:

$ touch ab.{1,2,3} 'a*b'.{1,2,3}
$ ls
a*b.1  ab.1  a*b.2  ab.2  a*b.3  ab.3

Suppose also that we want to match only the names containing that star. Since the star must be treated literally, we escape it in the pattern:

$ p='a\*b.?'
$ ( IFS=; for f in ${p} ; do echo "|$f|" ; done )
|a*b.1|
|a*b.2|
|a*b.3|

Because the ? is not escaped, it is treated as a wildcard character. Because the * is escaped, it matches only a literal *.
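The difference between the escaped and unescaped star can be checked with a short sketch. It assumes the same six files as above, created in a scratch directory; the variable and array names are illustrative.

```shell
#!/usr/bin/env bash
# Compare an unescaped star (wildcard) with an escaped star (literal).
workdir=$(mktemp -d) && cd "$workdir" || exit 1
touch ab.{1,2,3} 'a*b'.{1,2,3}

unescaped='a*b.?'       # * is a wildcard and also matches the empty string
escaped='a\*b.?'        # \* matches only a literal star

oldIFS=$IFS; IFS=       # no word splitting while the patterns expand
wild=( $unescaped )
literal=( $escaped )
IFS=$oldIFS

echo "unescaped pattern: ${#wild[@]} matches"
echo "escaped pattern:   ${#literal[@]} matches"
```

The unescaped pattern picks up the ab.N files as well as the a*b.N files, while the escaped pattern matches only the three names that contain a star.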

Owensby answered 24/10, 2014 at 18:46 Comment(1)
Bah, I forgot about trying the IFS. (I use it all the time in other scripting, just didn't think to apply it to this problem). Thanks @John1024. The one problem I have with this is that the IFS is set to something goofy for the duration of the for loop. But I'm thinking I can expand the glob into an array, and then loop through the array. I'll try some things out. - Gasper

The pattern used in

p="a b.?"

is not correct, which is clear if you use it directly:

A=( a b.? )

As is stated in the question, the correct pattern is

a\ b.?

so the correct variable assignment is

p='a\ b.?'

Where are the patterns coming from? Can they be corrected at source? For instance, if the pattern is being created by adding '.?' to a base, you can use 'printf' to do the required quoting:

base='a b'
printf -v p '%q.?' "$base"

'set' then shows:

p='a\ b.?'

Unfortunately, word splitting still causes it to fail if you try

A=( $p )

'set' shows:

A=([0]="a\\" [1]="b.?")

One way to work around the problem is to use 'eval':

eval "A=( $p )"

'set' then shows:

A=([0]="a b.1" [1]="a b.2" [2]="a b.3")

That is "eval trickery", but it's not clearly any worse than the IFS trickery previously described. Also, the IFS trickery will not help if the files to be matched have globbing metacharacters in their names. What do you do, for instance, if you've got the files created with

touch ab.{1,2,3} 'a*b'.{1,2,3}

and you want to match the ones whose base is 'a*b' ? No amount of IFS trickery will make it possible to match correctly with the variable p if it is set with

p="a*b.?"

After

base='a*b'
printf -v p '%q.?' "$base"

'set' shows:

p='a\*b.?'

and after

eval "A=( $p )"

it shows:

A=([0]="a*b.1" [1]="a*b.2" [2]="a*b.3")

I consider use of 'eval' to be a last resort, but in this case I can't think of a better option, and it's perfectly safe.

Neaten answered 24/10, 2014 at 23:26 Comment(0)
