Strange behavior of argv when passing string containing "!!!!"

3

31

I have written a small program that takes some input parameters from *argv[] and prints them. In almost all use cases my code works perfectly fine. A problem only arises when I use more than one exclamation mark at the end of the string I want to pass as an argument ...

This works:

./program -m "Hello, world!"

This does NOT work:

./program -m "Hello, world!!!!"

^^ If I do this, the program prints either that string twice or the command I entered before ./program.

However, what I absolutely don't understand: The following, oddly enough, DOES work:

./program -m 'Hello, world!!!!'

^^ The output is exactly ...

Hello, world!!!!

... just as desired.

So, my questions are:

  • Why does this strange behavior occur when using multiple exclamation marks in a string?
  • As far as I know, in C you use "" for strings and '' for single chars. So why do I get the desired result when using '', but not when using "" as I should (in my understanding)?
  • Is there a mistake in my code or what do I need to change to be able to enter any string (no matter if, what, and how many punctuation marks are used) and get exactly that string printed?

The relevant parts of my code:

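// this is a simplified example that, in essence, does the same
// as my (significantly longer) code
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char *argv[]) {
    // expect: ./program -m "message"; the message must fit into msg
    if (argc < 3 || strlen(argv[2]) >= 1024) return 1;

    char *msg = calloc(1024, sizeof(char)); // zero-filled, so msg starts as ""
    if (msg == NULL) return 1;

    printf("%s", strcat(msg, argv[2])); // argv[1] is "-m"

    free(msg);
    return 0;
}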

I already tried copying the content of argv[2] into a char* buffer first and appending a '\0' to it, which didn't change anything.

Zaslow answered 8/2, 2018 at 13:46 Comment(17)
Why printf("%s", strcat(msg, argv[2])) instead of printf("%s", argv[2])? - Chambray
The space, you need to escape it. Also, calloc() there doesn't make a lot of sense. Just char msg[1024] is very good. When you use the single quotes then the string is passed as is. This has nothing to do with argv or the C programming language, but with the shell. - Assign
@Michael Walz: Because I'm creating a much longer string in msg. Appending the content of argv to it is only the first of many steps in my full code. Sorry for not clarifying that earlier. - Zaslow
The exclamation mark is a special character for your shell (probably bash). If it is not placed in single quotes, the shell interprets !! and replaces it with something else (the previous command in history). Your program works correctly, it prints what it receives from the shell in the command line. - Copolymerize
@Iharob Al Asimi: I thought that by using calloc (instead of malloc) I was zeroing out the whole memory area, thus not having to use '\0' at the end of the string. - Zaslow
@Zaslow OK; but next time please narrow it down as much as possible. But, anyway, here the problem was not in your code but it's because of the shell (see answers below). - Chambray
I guess those of you hinting at the shell are right. I just tried "Hello, world?????" as program input and get exactly that as output. - Zaslow
That leaves me with one more question: Is there a way to ensure that users of my program can still use "Something!!!!" as input parameter without experiencing this behavior? (regardless of the shell they use) - Zaslow
@ci7i2en4: it's the way Bash works -- what if your users want to expand the previous command as a parameter to your program? - Sauerkraut
@Groo: The problem is, my program is not capable of handling this behavior (and I don't know how to change that). If a user does use "hallo!!" as input, the program is ended with the following error: -bash: syntax error near unexpected token `)' - Zaslow
@ci7i2en4: 1) the syntax error is a bash syntax error created during expansion, your program is not even started in that case, 2) your program should be responsible for validating input values, but you cannot "undo" the expansion done by bash. What you should be doing is: checking if argc has a valid length and checking if parameters make sense, and then probably displaying an error message with usage info. - Sauerkraut
^^ Somehow entering "hello!!!! :-)" leads to that bash error and the unexpected token has to be the ) from :-). I don't understand why that is a problem for bash. And I don't know how I should validate this input string when the user should be allowed to input any string they like. I can't expect the program users to know how bash works and what they shouldn't input ... - Zaslow
Just tested "(Hello, world!!!! :-))))". In this case I don't get the unexpected token error. I'll keep looking into this ... thanks, everyone! - Zaslow
It doesn't do that for me. - Mixture
@DonaldDuck You don't appear to be in Bash, which is what the question is about. - Overtone
Vote to reopen. The answers in the given dup contain less information than those in this question, and this is more widely viewed. Voted to close the other as a dup of this. - Kelcie
Since the question is not actually about C at all, and the problem is neither caused nor influenced by the C code (but is instead purely a matter of how the system parses the arguments, prior to main being called), the c tag is not appropriate for the question and I have removed it. - Trilateration
68

This is not related to your code but to the shell that starts it.

In Bash, and in other shells with csh-style history expansion such as zsh and csh, !! is shorthand for the last command that was run. When you use double quotes, the shell still performs history expansion (along with variable substitution, etc.) within the string, so when you put !! inside a double-quoted string, the shell substitutes the last command run.

What this means for your program is that all this happens before your program is executed, so there's not much the program can do except check if the string that is passed in is valid.

In contrast, when you use single quotes the shell does not do any substitutions and the string is passed to the program unmodified.

So you need to use single quotes to pass this string. Your users would need to know this if they don't want any substitution to happen. The alternative is to create a wrapper shell script that prompts the user for the string to pass in, then the script would subsequently call your program with the proper arguments.
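
Another way to keep the shell out of the picture, in the spirit of the wrapper idea, is to have the program itself read the message from standard input instead of from argv, since text typed at the program's own prompt or piped into it is never subject to history expansion. A minimal C sketch, assuming a hypothetical helper named read_message that is not part of the original program:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (an assumption, not from the question's code):
 * reads one line from `in` into `buf`, which holds `cap` bytes, and
 * strips the trailing newline. Returns 0 on success, -1 on EOF or error. */
int read_message(FILE *in, char *buf, size_t cap) {
    assert(in != NULL && buf != NULL);
    if (cap == 0 || fgets(buf, (int)cap, in) == NULL)
        return -1;
    buf[strcspn(buf, "\n")] = '\0'; /* cut the string at the newline, if present */
    return 0;
}
```

In main, the program could fall back to read_message(stdin, msg, 1024) when no -m argument is given. Lines longer than the buffer are truncated by fgets, so a real program would need to decide how to handle the remainder.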

Kelcie answered 8/2, 2018 at 13:51 Comment(6)
I see, thank you! That leaves me with one more question: Is there a way to ensure that users of my program can still use "Something!!!!" as input parameter without experiencing this behavior? (regardless of the shell they use) - Zaslow
@Zaslow That's up to your users to figure out. Perhaps they want that expansion to happen, perhaps not. - Kelcie
@ci7i2en4, in Bash, set +o histexpand or set +H to disable history expansion. Other shells may have other settings. - Skean
I would comment that any solution that changes expected behavior of the shell is probably a bad idea. A user should know how to send input in whatever shell they are using. - Lubricator
^^ Besides, that would only change the behavior of my bash. If other people run my program in their shells, they would still experience the history expansion ... - Zaslow
It's handy to look at this post too: Difference between double and single quotes in Bash. - Bernardinabernardine
10

The shell performs expansion inside double-quoted strings. If you read the History Expansion section of the Bash manual page (assuming you use Bash, which is the default on most Linux distributions), you will see that !! means

Refer to the previous command.

So !!!! in your double-quoted string will expand to the previous command, twice.

Such expansion is not made for single-quoted strings.

So the problem is not within your program, it's due to the environment (the shell) calling your program.

Teratogenic answered 8/2, 2018 at 13:51 Comment(0)
8

In addition to the supplied answers, you should remember that echo is your shell friend. If you prefix your command with "echo ", you will see what the shell is actually sending to your program.

echo ./program -m "Hello, world!!!!"

This would have shown you some strangeness and might have helped steer you in the right direction.

Trusting answered 8/2, 2018 at 18:32 Comment(6)
echo is actually a very poor choice of tools for this use -- echo "hello world" and echo "hello" "world" have precisely the same output, after all, despite being very different commands. - Tennies
Consider instead: print_args() { printf '%q ' "$@"; printf '\n'; } -- thereafter, print_args ./program -m "Hello, world!!!!" will emit arguments in a way that makes their interpretation unambiguous even in the cases that echo gets wrong. - Tennies
This is only Charles' opinion. I appreciate that he shows a deficiency in using echo as a tool, especially good that he supplied an alternative. For me, it is a poor choice to rely on a function being on every Linux environment that I visit (or copy/pasta) and remembering that function name. The reader now has two poor choices. One in an answer and the other in a comment. - Trusting
re: "opinion" -- see the POSIX specification for echo, particularly the APPLICATION USAGE section, which explicitly advises using printf instead unless the data being passed is restricted to a subset known to be safe. - Tennies
printf is guaranteed to be available everywhere (albeit a different format string, such as ' <%s>\n', may be appropriate if portability is a goal); one needn't use a function to wrap it unless one so chooses. Which is to say -- I don't particularly stand by the print_args function I suggested being a good choice. I absolutely stand by echo being a bad one, and printf being the correct replacement, and have normative documentation backing me up on that. - Tennies
@CharlesDuffy the output from a printf wrapper such as this might be unambiguous, but I found it quite difficult to interpret in testing. Ironically, I would say the best approach to the problem is a program much like the one in the question. - Trilateration
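
Following up on that last comment, a small C helper can show exactly what the shell handed to the program, one argument per line, so that argument boundaries stay visible where echo would blur them. This is only a sketch: the name dump_args and the angle-bracket output format are assumptions made for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (an assumption, not from the thread): writes every
 * argument on its own line as "argv[i] = <value>", so it is clear where
 * each argument starts and ends. Returns the number of arguments written. */
int dump_args(FILE *out, int argc, char *argv[]) {
    assert(out != NULL && argv != NULL);
    for (int i = 0; i < argc; i++)
        fprintf(out, "argv[%d] = <%s>\n", i, argv[i]);
    return argc;
}
```

Calling dump_args(stdout, argc, argv) at the top of main prints the arguments after the shell has finished all of its expansions, so a double-quoted "Hello, world!!!!" typed in an interactive Bash session would show up with the previous command already substituted in.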
