Properly handling spaces and quotes in bash completion
J

5

30

What is the correct/best way of handling spaces and quotes in bash completion?

Here’s a simple example. I have a command called words (e.g., a dictionary lookup program) that takes various words as arguments. The supported ‘words’ may actually contain spaces, and are defined in a file called words.dat:

foo
bar one
bar two

Here’s my first suggested solution:

_find_words()
{
search="$cur"
grep -- "^$search" words.dat
}

_words_complete()
{
local IFS=$'\n'

COMPREPLY=()
cur="${COMP_WORDS[COMP_CWORD]}"

COMPREPLY=( $( compgen -W "$(_find_words)" -- "$cur" ) )

}
complete -F _words_complete words

Typing ‘words f<tab>’ correctly completes the command to ‘words foo ’ (with a trailing space), which is nice, but for ‘words b<tab>’ it suggests ‘words bar ’. The correct completion would be ‘words bar\ ’. And for ‘words "b<tab>’ and ‘words 'b<tab>’ it offers no suggestions.

This last part I have been able to solve. It’s possible to use eval to properly parse the (escaped) characters. However, eval is not fond of missing quotes, so to get everything to work, I had to change the search="$cur" to

search=$(eval echo "$cur" 2>/dev/null ||
eval echo "$cur'" 2>/dev/null ||
eval echo "$cur\"" 2>/dev/null || "")

This actually works. Both ‘words "b<tab>’ and ‘words 'b<tab>’ are correctly autocompleted, and if I add an ‘o’ and press <tab> again, it actually completes the word and adds the correct closing quote. However, if I try to complete ‘words b<tab>’ or even ‘words bar\ <tab>’, it is autocompleted to ‘words bar ’ instead of ‘words bar\ ’, and adding for instance ‘one’ would fail when the words program is run.

Now, obviously it is possible to handle this correctly. For instance, the ls command can do it for files named ‘foo’, ‘bar one’ and ‘bar two’ (though it does have problems with some ways of expressing the filenames when one uses a (valid) combination of ", ' and various escapes). However, I couldn’t figure out how ls does it by reading the bash completion code.

So, does anybody know how to properly handle this? The actual input quotes need not be preserved; I would be happy with a solution that changes ‘words "b<tab>’, ‘words 'b<tab>’ and ‘words b<tab>’ to ‘words bar\ ’, for instance (though I would prefer stripping the quotes, as in this example, instead of adding them).

Jeremie answered 17/7, 2009 at 23:15 Comment(0)
H
8

This not too elegant postprocessing solution seems to work for me (GNU bash, version 3.1.17(6)-release (i686-pc-cygwin)). (Unless I didn't test some border case as usual :))

Don't need to eval things, there are only 2 kinds of quotes.

Since compgen doesn't want to escape spaces for us, we will escape them ourselves (only if word didn't start with a quote). This has a side effect of full list (on double tab) having escaped values as well. Not sure if that's good or not, since ls doesn't do it...

EDIT: Fixed to handle single and double quotes inside the words. Essentially we have to pass through 3 levels of unescaping :). First for grep, second for compgen, and last for the words command itself when autocompletion is done.

_find_words()
{
    search=$(eval echo "$cur" 2>/dev/null || eval echo "$cur'" 2>/dev/null || eval echo "$cur\"" 2>/dev/null || "")
    grep -- "^$search" words.dat | sed -e "{" -e 's#\\#\\\\#g' -e "s#'#\\\'#g" -e 's#"#\\\"#g' -e "}"
}

_words_complete()
{
    local IFS=$'\n'

    COMPREPLY=()
    local cur="${COMP_WORDS[COMP_CWORD]}"

    COMPREPLY=( $( compgen -W "$(_find_words)" -- "$cur" ) )

    local escaped_single_qoute="'\''"
    local i=0
    for entry in ${COMPREPLY[*]}
    do
        if [[ "${cur:0:1}" == "'" ]] 
        then
            # started with single quote, escaping only other single quotes
            # [']bla'bla"bla\bla bla --> [']bla'\''bla"bla\bla bla
            COMPREPLY[$i]="${entry//\'/${escaped_single_qoute}}" 
        elif [[ "${cur:0:1}" == "\"" ]] 
        then
            # started with double quote, escaping all double quotes and all backslashes
            # ["]bla'bla"bla\bla bla --> ["]bla'bla\"bla\\bla bla
            entry="${entry//\\/\\\\}" 
            COMPREPLY[$i]="${entry//\"/\\\"}" 
        else 
            # no quotes in front, escaping _everything_
            # [ ]bla'bla"bla\bla bla --> [ ]bla\'bla\"bla\\bla\ bla
            entry="${entry//\\/\\\\}" 
            entry="${entry//\'/\'}" 
            entry="${entry//\"/\\\"}" 
            COMPREPLY[$i]="${entry// /\\ }"
        fi
        (( i++ ))
    done
}
Hoon answered 18/7, 2009 at 4:22 Comment(6)
Thanks. This solution works for the original examples, but if I add ‘rock 'n roll’ to words.dat, it fails. My real-life use of the autocompletion actually involves words with apostrophes, and that’s the reason I originally used eval. It’s easy enough (though not very elegant) to fix, by adding an extra ‘search and replace’ to the for loop, and then adding another for loop for strings beginning with '. The only remaining problem, as far as I can see, is that the autocompletion does not advance one cursor position if you have written an entire word, including any closing quotes.Jeremie
Regarding my comment above: it looks like the situation is slightly worse than I thought. Auto-completion inside a word containing apostrophes (e.g., trying to autocomplete ‘rock 'n ro’, either written using escaped spaces and apostrophe, or single or double quotes) doesn’t work. The reason is that the search variable is not in its correct expanded form. Some extra substitutions seem possible, but I haven’t been able to get this to work correctly for all three different ways of escaping.Jeremie
Yes, compgen seems to be stripping all quotes... Which means they must be escaped inside _find_wordsHoon
If you need to handle some other special characters ($, `, {, ( or whatever else bash might take offence on -- didn't test those), just escape them in similar way -- first backslash, then everything else in 2 or 3 places.Hoon
There are still some things that don’t work, e.g. completing "rock 'n. Most of these can be fixed by removing "$cur" from the _find_words call. Then everything works for strings that start with a double quote or nothing, except when ‘completing’ an already complete word starting with a double quote (since the last double quote incorrectly gets quotes?). And there are still some problems with completing words beginning with a single quote (only the last part of the word seems to be recognised when the completion suggestion is inserted). But all in all, this solution works OK. Thanks!Jeremie
Also, if you have trouble completing arguments like "text 'more_text", where there is a single quote inside double quotes, you can try COMPREPLY=( $( compgen -W "$(_pyxbindman_get_choices)" -- "${cur/\'/\'}" ) ) instead of the similar line in the script (it escapes single quotes; compgen needs that for some reason).Mckinnon
F
30

The question is rather loaded but this answer attempts to explain each aspect:

  1. How to handle spaces with COMPREPLY.
  2. How does ls do it.

There're also people reaching this question wanting to know how to implement the completion function in general. So:

  1. How do I implement the completion function and correctly set COMPREPLY?

How does ls do it

Moreover, why does it behave differently to when I set COMPREPLY?

Back in '12 (before I updated this answer), I was in a similar situation and searched high and low for the answer to this discrepancy myself. Here's the answer I came up with.

ls, or rather, the default completion routine, does it using the -o filenames functionality. This option performs filename-specific processing (like adding a slash to directory names or suppressing trailing spaces).

To demonstrate:

$ foo () { COMPREPLY=("bar one" "bar two"); }
$ complete -o filenames -F foo words
$ words ░

Tab

$ words bar\ ░          # Ex.1: notice the space is completed escaped

TabTab

bar one  bar two        # Ex.2: notice the spaces are displayed unescaped
$ words bar\ ░

Immediately there are two points I want to make clear to avoid any confusion:

  • First of all, your completion function cannot be implemented simply by setting COMPREPLY to an array of your word list! The example above is hard-coded to return candidates starting with b-a-r just to show what happens when TabTab is pressed. (Don't worry, we'll get to a more general implementation shortly.)

  • Second, the above format for COMPREPLY only works because -o filenames is specified. For an explanation of how to set COMPREPLY when not using -o filenames, look no further than the next heading.

Also note, there's a downside of using -o filenames: If there's a directory lying about with the same name as the matching word, the completed word automatically gets an arbitrary slash attached to the end. (e.g. bar\ one/)

How to handle spaces with COMPREPLY without using -o filenames

Long story short, it needs to be escaped.

In contrast to the above -o filenames demo:

$ foo () { COMPREPLY=("bar\ one" "bar\ two"); }     # Notice the backslashes I've added
$ complete -F foo words                             # Notice the lack of -o filenames
$ words ░

Tab

$ words bar\ ░          # Same as -o filenames, space is completed escaped

TabTab

bar\ one  bar\ two      # Unlike -o filenames, notice the spaces are displayed escaped
$ words bar\ ░
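
The escaped form used in this demo does not have to be typed by hand: bash's printf builtin with the %q format produces it. A minimal sketch, where the helper name escape_candidates is made up for illustration and is not part of bash:

```shell
# escape_candidates is a hypothetical helper, not a bash builtin:
# it prints each argument in shell-escaped form, one per line.
escape_candidates() {
    printf '%q\n' "$@"
}

# Prints the two candidates with their spaces backslash-escaped.
escape_candidates "bar one" "bar two"
```

The output lines are in the same form as the hand-written "bar\ one" and "bar\ two" strings in the demo above.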

How do I actually implement a completion function?

Implementing a completion function involves:

  1. Representing your word list.
  2. Filtering your word list to just candidates for the current word.
  3. Setting COMPREPLY correctly.

I'm not going to assume to know all the complex requirements there can be for 1 and 2 and the following is only a very basic implementation. I'm providing an explanation for each part so one can mix-and-match to fit their own requirements.

foo() {
    # Get the currently completing word
    local CWORD=${COMP_WORDS[COMP_CWORD]}

    # This is our word list (in a bash array for convenience)
    local WORD_LIST=(foo 'bar one' 'bar two')

    # Commands below depend on this IFS
    local IFS=$'\n'

    # Filter our candidates
    local CANDIDATES=($(compgen -W "${WORD_LIST[*]}" -- "$CWORD"))

    # Correctly set our candidates to COMPREPLY
    if [ ${#CANDIDATES[*]} -eq 0 ]; then
        COMPREPLY=()
    else
        COMPREPLY=($(printf '%q\n' "${CANDIDATES[@]}"))
    fi
}

complete -F foo words

In this example, we use compgen to filter our words. (It's provided by bash for this exact purpose.) One could use any solution they like but I'd advise against using grep-like programs simply because of the complexities of escaping regex.

compgen takes the word list with the -W argument and returns the filtered result with one word per line. Since our words can contain spaces, we set IFS=$'\n' beforehand in order to only count newlines as element delimiters when putting the result into our array with the CANDIDATES=(...) syntax.

Another point of note is what we're passing for the -W argument. This argument takes an IFS delimited word list. Again, our words contain spaces so this too requires IFS=$'\n' to prevent our words being broken up. Incidentally, "${WORD_LIST[*]}" expands with elements also delimited with what we've set for IFS and is exactly what we need.
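
The effect of IFS on compgen -W can be tried outside of a completion function. The following is only a sketch; the names word_list and filter_words are assumptions made for the illustration:

```shell
# Word list with one candidate per line, as in words.dat from the question.
word_list=$'foo\nbar one\nbar two'

# filter_words is a hypothetical wrapper around compgen.
filter_words() {
    # With IFS set to a newline, compgen -W splits the list on newlines only,
    # so "bar one" stays a single candidate instead of becoming "bar" and "one".
    local IFS=$'\n'
    compgen -W "$word_list" -- "$1"
}

# Prints the candidates starting with "b", one per line.
filter_words b
```

Because IFS is declared local inside the function, the caller's IFS is left untouched.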

In the example above I chose to define WORD_LIST literally in code.

One could also initialize the array from an external source such as a file. Just make sure to move IFS=$'\n' beforehand if words are going to be line-delimited such as in the original question:

local IFS=$'\n'
local WORD_LIST=($(cat /path/to/words.dat))
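
As an alternative sketch, assuming bash 4 or later, the mapfile builtin reads a file into an array with one element per line, without relying on IFS or word splitting. The function name load_words is hypothetical and the file path is supplied by the caller:

```shell
# load_words is a hypothetical helper: it fills the array WORD_LIST
# with the lines of the file named by its first argument.
load_words() {
    # -t strips the trailing newline from each line read
    mapfile -t WORD_LIST < "$1"
}
```

Unlike the unquoted $(cat ...) form, the lines read by mapfile are not subject to pathname expansion, so a word such as * would be kept literally.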

Finally, we set COMPREPLY making sure to escape the likes of spaces. Escaping is quite complicated but thankfully printf's %q format performs all the necessary escaping we need and that's what we use to expand CANDIDATES. (Note we're telling printf to put \n after each element because that's what we've set IFS to.)

Those observant may spot that this form for COMPREPLY only applies if -o filenames is not used. No escaping is necessary if it is, and COMPREPLY may be set to the same contents as CANDIDATES with COMPREPLY=("${CANDIDATES[@]}").

Extra care should be taken when expansions may be performed on empty arrays as this can lead to unexpected results. The example above handles this by branching when the length of CANDIDATES is zero.
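
To illustrate the kind of unexpected result meant here: when printf is given a format but no arguments, it still applies the format once, and %q renders the missing argument as an empty quoted string. A small sketch (the variable names are arbitrary):

```shell
# An empty array expands to zero arguments, but printf still runs the
# format once and treats the missing argument as an empty string.
empty=()
escaped=$(printf '%q\n' "${empty[@]}")

# Shows what would end up in COMPREPLY without the length check.
echo "escaped=[$escaped]"
```

Without the branch on the array length, COMPREPLY would therefore receive one bogus candidate instead of being empty.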

Fundamentalism answered 18/7, 2012 at 7:35 Comment(5)
Your first example doesn't actually work. TAB TAB gets you the list ok, but when you try "words bar t<TAB>" it fails.Directional
You're misunderstanding the purpose of the example. If you want to actually implement a bash completion routine, it takes a lot more than just setting COMPREPLY to a static array. I'm sure there're more appropriate questions on SO that deal specifically with this.Fundamentalism
if u say so. i took the liberty of providing a fully working example of your example, as an answer.Directional
you can also unset IFS as soon as it's served its purpose. That way subsequent commands can be simplified (e.g. no need to printf ;%q\n)... see github.com/iterative/shtab/pull/106/commits/…Headwaters
@Headwaters It doesn't seem so. It's a local variable so it's already unset automatically at the end of the function. The need for printf %q has nothing to do with IFS. I've replied to the commit you linked detailing how unsetting it before compgen -W is damaging.Fundamentalism
D
5
_foo ()
{
  words="bar one"$'\n'"bar two"
  COMPREPLY=()
  cur=${COMP_WORDS[COMP_CWORD]}
  prev=${COMP_WORDS[COMP_CWORD-1]}
  cur=${cur//\./\\\.}

  local IFS=$'\n'
  COMPREPLY=( $( grep -i "^$cur" <( echo "$words" ) | sed -e 's/ /\\ /g' ) )
  return 0
}

complete -o bashdefault -o default -o nospace -F _foo words 
Directional answered 13/12, 2013 at 16:46 Comment(9)
Why are you escaping . into \. on $cur?Photodisintegration
But bash substitutions are never regular expressions, right? Consider: text='foo.bar'; echo "${text//./_}" - that works as a literal replace of the dot on GNU bash 4.3.11 (Linux) and 3.1.20 (Windows).Photodisintegration
But when it's passed to grep, grep is going to interpret it as a regular expression.Directional
What I mean is, cur=${cur//./\\.} has the same effect. Or should, that's why I'm asking. (By the way, one alternative to using grep in this case is to separate by line, trim each line to the length of the pattern string, and test if both are identical - this is what I'm using as it's not only faster when using just builtins but avoids other characters that have special meaning for grep).Photodisintegration
Apparently you are correct. cur=${cur//./\\.} works, as does cur=${cur//.\\\.}. Must be pure reflex after seeing that / to go into regex mode. as for faster inline methods, fabulous idea, but auto-complete files are already so hard to read for other people.Directional
Since this is indeed rather hard to read I've made it into a function that one can "just use". Only shell builtins, multiple prefixes possible, case insensitive option, and just pipe stdin and give line prefixes as parameters. pastebin.com/weMpizK6Photodisintegration
This is the only solution that worked for me (out of all the current answers in SO). Thanks, @Directional !Inessential
@Inessential Thanks, that's because I only come to SO when I'm trying to solve my own problems, and when I'm not satisfied with the answers I find, I work it out myself, and return to supply a decent answer. It doesn't earn me many points, but comments like yours do make it worthwhile.Directional
omg, some ass just downvoted without explanation. i'm not an SO point-_w_hore, but seriously.Directional
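
The builtin-only alternative to grep mentioned in the comments above (trim each line to the length of the prefix and compare) could look roughly like this. It is a sketch only, not the code from the linked pastebin, and the name prefix_filter is made up:

```shell
# prefix_filter is a hypothetical helper: it reads lines from stdin and
# prints those that start with the literal string given as $1.
prefix_filter() {
    local prefix=$1 line
    while IFS= read -r line; do
        # Quoting "$prefix" makes [[ == ]] compare literally, so characters
        # such as . or * in the prefix have no special meaning.
        if [[ ${line:0:${#prefix}} == "$prefix" ]]; then
            printf '%s\n' "$line"
        fi
    done
}
```

Usage would be along the lines of prefix_filter "$cur" < words.dat, which avoids having to escape $cur for a regular expression.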
T
1

Pipe _find_words through sed and have it enclose each line in quotation marks. And when typing a command line, make sure to put either " or ' before a word to be tab-completed, otherwise this method will not work.

_find_words() { cat words.dat; }

_words_complete()
{

  COMPREPLY=()
  cur="${COMP_WORDS[COMP_CWORD]}"

  local IFS=$'\n'
  COMPREPLY=( $( compgen -W "$( _find_words | sed 's/^/\x27/; s/$/\x27/' )" \
                         -- "$cur" ) )

}

complete -F _words_complete words

Command line:

$ words "ba░

tab

$ words "bar ░

tabtab

bar one  bar two
$ words "bar o░

tab

$ words "bar one" ░
Tanguy answered 17/7, 2015 at 15:40 Comment(0)
T
0

I solved this by creating my own function compgen2, which handles the extra processing when the current word doesn't begin with a quote character. Otherwise it works similarly to compgen -W.

compgen2() {
    local IFS=$'\n'
    local a=($(compgen -W "$1" -- "$2"))
    local i=""
    if [ "${2:0:1}" = "\"" -o "${2:0:1}" = "'" ]; then
        for i in "${a[@]}"; do
            echo "$i"
        done
    else
        for i in "${a[@]}"; do
            printf "%q\n" "$i"
        done
    fi
}

_foo() {
    local cur=${COMP_WORDS[COMP_CWORD]}
    local prev=${COMP_WORDS[COMP_CWORD-1]}
    local words=$(cat words.dat)
    local IFS=$'\n'
    COMPREPLY=($(compgen2 "$words" "$cur"))
}

echo -en "foo\nbar one\nbar two\n" > words.dat
complete -F _foo foo
Thapsus answered 6/3, 2017 at 22:14 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.