Iterate over a list of quoted strings
T

6

5

I'm trying to run a for loop over a list of strings, some of which are quoted and some of which are not, like so:

STRING='foo "bar_no_space" "baz with space"'
for item in $STRING; do
    echo "$item"
done

Expected result:

foo
bar_no_space
baz with space

Actual result:

foo
"bar_no_space"
"baz
with
space"

I can achieve the expected result by running the following command:

bash -c 'for item in '"$STRING"'; do echo "$item"; done;'

I would like to do this without spawning a new bash process or using eval because I do not want to take the risk of having random commands executed.

Please note that I do not control the definition of the STRING variable; I receive it through an environment variable. So I can't write something like:

array=(foo "bar_no_space" "baz with space")
for item in "${array[@]}"; do
    echo "$item"
done

If it helps, what I am actually trying to do is split the string as a list of arguments that I can pass to another command.

I have:

STRING='foo "bar_no_space" "baz with space"'

And I want to run:

my-command --arg foo --arg "bar_no_space" --arg "baz with space"
Twombly answered 23/11, 2017 at 22:58 Comment(3)
STRING is not a list of quoted strings; it is a single string. In that string, quotes have no more meaning than any other character. - Steere
Note that bash -c '...' is no safer than eval; you are still executing arbitrary code. - Steere
@Steere I know bash -c '...' will execute the code. Basically I was hoping bash would expose a way to parse the string the way it does internally (because it must do it somehow) but without all the other interpreting features. I guess there is no such thing and I must use a parsing approach outside of bash. I don't really understand all the negative votes though. Are people just voting down because there is no answer? - Twombly
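To make the point from the comments concrete, here is a minimal sketch of why the bash -c one-liner is unsafe. The value of STRING below is a made-up hostile input (an assumption, not from the question), and a harmless echo stands in for a real payload:

```bash
#!/usr/bin/env bash
# Hypothetical hostile value: the command substitution is plain text here,
# because it sits inside single quotes.
STRING='foo $(echo ran-a-command)'

# Same shape as the bash -c one-liner from the question.
# The child shell parses STRING as code, so the substitution gets executed.
output=$(bash -c 'for item in '"$STRING"'; do printf "%s\n" "$item"; done')

printf '%s\n' "$output"
```

The second item comes out as ran-a-command rather than the literal text $(echo ran-a-command), which shows the child shell executed part of the input.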
F
3

Use an array instead of a normal variable.

arr=(foo "bar_no_space" "baz with space")

To print the values:

printf '%s\n' "${arr[@]}"

And to call your command:

my-command --arg "${arr[0]}" --arg "${arr[1]}" --arg "${arr[2]}"
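If the number of elements is not known in advance, the --arg pairs can be built in a loop instead of indexing by hand. This is only a sketch: my_command is a hypothetical stand-in for the real my-command, and cmd_args is a name I made up.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for my-command: prints each received argument in brackets.
my_command() { printf '[%s]' "$@"; printf '\n'; }

arr=(foo "bar_no_space" "baz with space")

# Prefix every element with --arg; quoting keeps each element as one word.
cmd_args=()
for item in "${arr[@]}"; do
    cmd_args+=(--arg "$item")
done

my_command "${cmd_args[@]}"
```

Each element stays a single argument, including the one with spaces, because the array is always expanded as "${cmd_args[@]}".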
Fairbanks answered 23/11, 2017 at 23:21 Comment(3)
I do not control the definition of STRING, I receive it through an environment variable. - Twombly
@Twombly Then whoever made the decision to pass a list of arbitrary strings in a single string made a mistake. The design is broken. - Steere
Giving a path as a parameter to a bash script describes exactly this scenario. The design is not broken. - Calgary
W
2

Solved: xargs + subshell

A few years late to the party, but...

Malicious Input:

SSH_ORIGINAL_COMMAND='echo "hello world" foo '"'"'bar'"'"'; sudo ls -lah /; say -v Ting-Ting "evil cackle"'

Note: I originally had an rm -rf in there, but then I realized that would be a recipe for disaster when testing variations of the script.

Converted perfectly into safe args:

# DO NOT put IFS= on its own line
IFS=$'\r\n' GLOBIGNORE='*' args=($(echo "$SSH_ORIGINAL_COMMAND" \
  | xargs bash -c 'for arg in "$@"; do echo "$arg"; done'))
echo "${args[@]}"

See that you can indeed pass these arguments just like $@:

for arg in "${args[@]}"
do
  echo "$arg"
done

Output:

hello world
foo
bar;
sudo
ls
-lah
/;
say
-v
Ting-Ting
evil cackle

I'm too embarrassed to say how much time I spent researching this to figure it out, but once you get the itch... y'know?

Defeating xargs

It is possible to fool xargs by providing escaped quotes:

SSH_ORIGINAL_COMMAND='\"hello world\"'

This can make a literal quote part of the output:

"hello
world"

Or it can cause an error:

SSH_ORIGINAL_COMMAND='\"hello world"'
xargs: unmatched double quote; by default quotes are special to xargs unless you use the -0 option

In either case, it doesn't enable arbitrary execution of code - the parameters are still escaped.
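A possible variation on the same idea, sketched under the assumption that an external printf is available for xargs to run: let xargs emit each word followed by a NUL byte and collect the words with a read loop. That way the current shell does no word splitting or globbing, so IFS and GLOBIGNORE do not need to be touched. The names args and word are my own.

```bash
#!/usr/bin/env bash
# Sample input taken from the question.
STRING='foo "bar_no_space" "baz with space"'

args=()
# xargs does the quote-aware splitting; printf re-emits each word followed by a NUL.
# read -d '' consumes one NUL-terminated word per iteration.
while IFS= read -r -d '' word; do
    args+=("$word")
done < <(printf '%s\n' "$STRING" | xargs printf '%s\0')

printf '<%s>\n' "${args[@]}"
```

Two things to keep in mind. With the bash -c form above, the first word that xargs passes ends up in $0 rather than in "$@", which appears to be why echo is missing from the output. And with GNU xargs, empty input still runs the command once, so this sketch would produce a single empty element in that case.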

Washedup answered 2/6, 2019 at 5:34 Comment(0)
W
1

Pure bash parser

Here's a quoted-string parser written in pure bash (what terrible fun)!

Caveat: just like the xargs example above, this errors in the case of an escaped quote.

Usage

MY_ARGS="foo 'bar baz' qux * "'$(dangerous)'" sudo ls -lah"

# Create array from multi-line string
IFS=$'\r\n' GLOBIGNORE='*' args=($(parseargs "$MY_ARGS"))

# Show each of the arguments array
for arg in "${args[@]}"; do
    echo "$arg"
done

Output:

$@: foo bar baz qux *
foo
bar baz
qux
*
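As far as I can tell, a line made up only of assignments (IFS=... GLOBIGNORE=... args=(...)) leaves IFS and GLOBIGNORE changed for the rest of the script. A read loop is one way to collect newline-separated output without that side effect. In this sketch, emit_args is a hypothetical stand-in for parseargs that just prints one argument per line:

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for parseargs: prints one argument per line.
emit_args() { printf '%s\n' 'foo' 'bar baz' 'qux' '*'; }

args=()
# IFS= applies to read only, so the global IFS stays untouched.
# -r keeps backslashes literal; no globbing happens, so * stays a plain string.
while IFS= read -r line; do
    args+=("$line")
done < <(emit_args)

printf '%s\n' "${args[@]}"
```

Like the original, this still cannot represent an argument that itself contains a newline.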

Parse Argument Function

Literally going character-by-character and adding to the current string, or adding to the array.

set -u
set -e

# ParseArgs will parse a string that contains quoted strings the same as bash does
# (same as most other *nix shells do). This is secure in the sense that it doesn't do any
# executing or interpreting. However, it also doesn't do any escaping, so you shouldn't pass
# these strings to shells without escaping them.
parseargs() {
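    # Sentinel for "not inside a quote". It must be empty: a one-character
    # sentinel such as "-" would make that literal character get dropped.
    notquote=""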
    str=$1
    declare -a args=()
    s=""

    # Strip leading space, then trailing space, then end with space.
    str="${str## }"
    str="${str%% }"
    str+=" "

    last_quote="${notquote}"
    is_space=""
    n=$(( ${#str} - 1 ))

    for ((i=0;i<=$n;i+=1)); do
        c="${str:$i:1}"

        # If we're ending a quote, break out and skip this character
        if [ "$c" == "$last_quote" ]; then
            last_quote=$notquote
            continue
        fi

        # If we're in a quote, count this character
        if [ "$last_quote" != "$notquote" ]; then
            s+=$c
            continue
        fi

        # If we encounter a quote, enter it and skip this character
        if [ "$c" == "'" ] || [ "$c" == '"' ]; then
            is_space=""
            last_quote=$c
            continue
        fi

        # If it's a space, store the string
        re="[[:space:]]+" # must be used as a var, not a literal
        if [[ $c =~ $re ]]; then
            if [ "0" == "$i" ] || [ -n "$is_space" ]; then
                # Skip leading or repeated whitespace without printing anything
                continue
            fi
            is_space="true"
            args+=("$s")
            s=""
            continue
        fi

        is_space=""
        s+="$c"
    done

    if [ "$last_quote" != "$notquote" ]; then
        >&2 echo "error: quote not terminated"
        return 1
    fi

    for arg in "${args[@]}"; do
        echo "$arg"
    done
    return 0
}

I may or may not keep this updated at:

Seems like a rather stupid thing to do... but I had the itch... oh well.

Washedup answered 2/6, 2019 at 8:43 Comment(1)
OMG... :) +1 for the perverted fun... I can feel/share your motivation, but there's no way I'd be willing to even look at that code. Not because of you, of course, but because of bash. (Actually I just did... Couldn't resist. It's not even that horrid...) - Calica
A
1

Here is a way that avoids an array of strings and other complications (but it does call bash and use eval):

STRING='foo "bar_no_space" "baz with space"'
eval "bash -c 'while [ -n \"\$1\" ]; do echo \"\$1\"; shift; done' -- $STRING"

Output:

foo
bar_no_space
baz with space

If you want to do something more complex with the strings than just echo them, you can split your script:

split_qstrings.sh

#!/bin/bash
while [ -n "$1" ]
do
    echo "$1"
    shift
done

The other part does the more complex processing (capitalizing a character, for example):

STRING='foo "bar_no_space" "baz with space"'

eval "split_qstrings.sh $STRING" | while IFS= read -r line
do
   echo "$line" | sed 's/a/A/g'
done

Output:

foo
bAr_no_spAce
bAz with spAce
Ahn answered 7/10, 2021 at 9:58 Comment(1)
In case it's not clear to others: a downside of this approach (using eval) is that if the input string is malicious, it can run arbitrary commands on your system. For example, the string could contain ; followed by extra commands, which eval will run. - Hildegardhildegarde
Y
0

Can you try something like this:

sh-4.4$ echo $string
foo "bar_no_space" "baz with space"
sh-4.4$ echo $string|awk 'BEGIN{FS="\""}{for(i=1;i<NF;i++)print $i}'|sed '/^ $/d'
foo
bar_no_space
baz with space
Yuriyuria answered 24/11, 2017 at 3:50 Comment(2)
That won't work in general: string='foo "bar_with_\"" "baz with space"' - Steere
@Steere I did not try it on a general string. I took the one he provided as the example string. Thanks for pointing it out. - Yuriyuria
H
0

I know your question is about Bash, but since it's often available in the same places, you might look at Perl's built-in Text::ParseWords module to do the heavy lifting, sending the results back to Bash.

For example:

#!/usr/bin/env bash

STRING='foo "bar_no_space" "baz with space"'
readarray -t arr < <( perl -MText::ParseWords -e '$,="\n"; print shellwords(@ARGV),"";' "$STRING" )

for item in "${arr[@]}"; do
    echo "$item"
done

# prints:
#   foo
#   bar_no_space
#   baz with space

As written, that uses newlines as the delimiter, but you could do whatever you want. In particular, if your input strings contained raw newlines themselves (inside quotes, for instance), you could use a NUL separator or something else pretty easily.

In fact, I'll do that now:

#!/usr/bin/env bash

STRING='foo "bar with space" "baz\
with\
newline"'
readarray -t -d '' arr < <( perl -MText::ParseWords -e '$,="\0"; print shellwords(@ARGV),"";' "$STRING" )

for item in "${arr[@]}"; do
    echo "item: $item"
done

# prints:
#   item: foo
#   item: bar with space
#   item: baz
#   with
#   newline

Of course, if your input string has raw NUL characters, then it's still potentially a problem (though not a security one).

Also beware: if your string has non-matching quote characters, you could get unexpected results.

Even though this is not pure Bash code, I think it is in the spirit of shell scripts to offload the "hard work" to external programs better suited to the task (:

Hildegardhildegarde answered 8/11, 2023 at 17:44 Comment(0)
