execlp("malicious_program", "ls", NULL); // program name not in argv[0]
There's nothing wrong with this. Many programs can be run with different argv[0]
values, and they use this as a flag to execute differently.
For example, if a shell is run with - as the first character of argv[0], it executes as a login shell. This is a historical artifact: there was no standard for arguments to shells, but the login system needed a way of telling the shell that it should operate in login mode.
On some systems, sh and bash are the same program; it checks argv[0] to determine whether to enable bash extensions.
If telnet is executed with some other argv[0], it's taken to be the destination hostname; this allows you to make symlinks to telnet with the name of a server, and use them as shortcuts.
Programs that don't make any special use of argv[0] (the vast majority) completely ignore it. Putting a "malicious" value there will have no effect.
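To make the argv[0] conventions above concrete, here is a minimal sketch of how a program might inspect its own name. The helper names (invoked_name, is_login_invocation, extensions_enabled) are hypothetical and are not taken from any real shell's source; they only illustrate the idea.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: return the last path component of argv[0]. */
static const char *invoked_name(const char *argv0)
{
    const char *slash = strrchr(argv0, '/');
    return slash ? slash + 1 : argv0;
}

/* Login-shell convention: the first character of argv[0] is '-'. */
static bool is_login_invocation(const char *argv0)
{
    return argv0 != NULL && argv0[0] == '-';
}

/* Hypothetical sh/bash-style switch: extensions stay on unless the
 * program was invoked under the name "sh". Assumes argv0 is non-NULL. */
static bool extensions_enabled(const char *argv0)
{
    const char *name = invoked_name(argv0);
    if (name[0] == '-')   /* skip the login-shell marker */
        name++;
    return strcmp(name, "sh") != 0;
}
```

A program that never calls helpers like these is unaffected by whatever the caller puts in argv[0].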
Why would malicious_program care that you put ls in argv[0]? Did you write that backwards, and intend this:
execlp("ls", "malicious_program", NULL);
execlp("ls", "ls", NULL, "-al", NULL); // NULL prior to end
Any arguments after NULL will be ignored, because the NULL argument is how execlp() determines where the end of the arguments is. Variadic argument lists don't provide any way for the function to determine the actual number of arguments. So no element of argv before argv[argc] can ever be null, because the position of the first null pointer is what determines the value of argc.
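The way a variadic function finds the end of its arguments can be sketched as follows. count_until_null is a hypothetical helper written for illustration, not the actual implementation of execlp(); it only shows that the walk stops at the first null pointer and never sees anything after it.

```c
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical helper: count arguments the way an execl-style function
 * has to, by walking the list until the first null pointer. */
static int count_until_null(const char *first, ...)
{
    va_list ap;
    int count = 0;
    const char *arg = first;

    va_start(ap, first);
    while (arg != NULL) {
        count++;
        arg = va_arg(ap, char *);   /* next argument, or the terminator */
    }
    va_end(ap);
    return count;
}
```

Called as count_until_null("ls", (char *)NULL, "-al", (char *)NULL), the loop stops after "ls", so "-al" is never read. The cast to (char *) matters in variadic calls, because a bare NULL is not guaranteed to be passed as a pointer.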
execlp("ls", "ls"); // no NULL at end
This will cause undefined behavior. Without NULL, execlp() doesn't know how many arguments there are (see above), so it will keep reading past the arguments you actually passed and try to put those nonexistent arguments into argv.
As for whether it's safe to access these strings, I think it should always be. When argv is constructed by the program loader, it constructs a brand new set of strings; it doesn't simply pass along the pointers that were provided to exec*(). The actual layout of argv in memory is a single block of memory, with each argument consecutively allocated. E.g. if argv is
argv[0] = "ls"
argv[1] = "ls"
argv[2] = "filename"
argv[3] = NULL
the memory holding all the arguments will look like:
ls\0ls\0filename\0\0
and the elements of argv are pointers into this block.
So these pointers will never be invalid; they will always point into this block. The strings themselves may be meaningless, of course. In the case where you don't provide a NULL argument, it will copy garbage strings into argv.
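The layout described above can be imitated in user space. The sketch below is only an illustration of that layout, not the loader's real code: pack_args is a hypothetical function that copies the strings back to back into one block and fills a pointer array that points into it.

```c
#include <stddef.h>
#include <string.h>

/* Illustration only: copy n strings consecutively into block and fill
 * out[] with pointers into it, followed by a NULL terminator.
 * out must have room for n + 1 pointers.
 * Returns 0 on success, -1 if block is too small. */
static int pack_args(const char *const src[], size_t n,
                     char *block, size_t cap, char *out[])
{
    size_t used = 0;

    for (size_t i = 0; i < n; i++) {
        size_t len = strlen(src[i]) + 1;   /* include the '\0' */
        if (len > cap - used)
            return -1;
        memcpy(block + used, src[i], len);
        out[i] = block + used;             /* pointer into the block */
        used += len;
    }
    out[n] = NULL;
    return 0;
}
```

Because every out[i] points into block, each pointer stays valid for as long as the block does, regardless of what happened to the caller's original strings.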
char str[] = { 'a' }; execlp("ls", "ls", str, NULL);
I find that it is undefined, in that the \0 byte will show up at a random position, so you get varying results with different executions. As someone on the receiving end of argv, do I simply treat it with the expectation of it being proper? – Misbehavior