Is it possible to change argv or do I need to create an adjusted copy of it?
Z

8

37

My application potentially has a huge number of arguments passed in, and I want to avoid the memory hit of duplicating the arguments into a filtered list. I would like to filter them in place, but I suspect that messing with the argv array itself, or any of the data it points to, is not advisable. Any suggestions?

Zsigmondy answered 8/6, 2009 at 5:14 Comment(1)
Andrew, that's the limit on the length of the command line, not on the number of arguments. Windows doesn't count arguments, just characters. The application splits it into arguments. (That's often handled by the compiler-vendor-supplied C RTL portion of the application, but it's still the app and not the OS.)Castile
M
30

Once argv has been passed into the main function, you can treat it like any other C array: change it in place as you like, just be aware of what you're doing with it. The contents of the array have no effect on the return code or execution of the program other than what you explicitly do with them in code. I can't think of any reason why modifying it wouldn't be advisable, or why it would need to be treated specially.

Of course, you still need to take care not to accidentally access memory beyond the bounds of argv. The flip side of it being accessible like a normal C array is that it's also prone to access errors just like any other C array. (Thanks to all who pointed this out in comments and other responses!)

Mortmain answered 8/6, 2009 at 5:18 Comment(2)
The elements are like any other strings allocated elsewhere; of course, be sure not to walk off the end of them. You don't know what is out there.Coppinger
Note that (on Linux, at least) you can overwrite the contents of argv to hide passwords from ps: unix.stackexchange.com/a/88679/27530Yogurt
S
80

The C99 standard says this about modifying argv (and argc):

The parameters argc and argv and the strings pointed to by the argv array shall be modifiable by the program, and retain their last-stored values between program startup and program termination.

Seminole answered 8/6, 2009 at 7:28 Comment(0)
C
8

The latest draft of the C standard (N1256) states that there are two allowed forms of the main function:

int main (void);
int main (int argc, char* argv[]);

but the crux is the clause "or in some other implementation-defined manner". This seems to me to be a loophole in the standard large enough to drive a semi-trailer through.

Some people specifically use const char * for the argv to disallow changes to the arguments. If your main function is defined that way, you are not permitted to change the characters that argv[] points to, as evidenced by the following program:

pax> cat qq.c
#include <stdio.h>
int main (int c, const char *v[]) {
    *v[1] = 'X';
    printf ("[%s]\n", v[1]);
    return 0;
}

pax> gcc -o qq qq.c
qq.c: In function `main':
qq.c:3: error: assignment of read-only location

However, if you remove the const, it works fine:

pax> cat qq2.c
#include <stdio.h>
int main (int c, char *v[]) {
    *v[1] = 'X';
    printf ("[%s]\n", v[1]);
    return 0;
}

pax> gcc -o qq2 qq2.c ; ./qq2 Hello
[Xello]

I think this is also the case for C++. The current draft states:

All implementations shall allow both of the following definitions of main:
    int main();
    int main(int argc, char* argv[]);

but it doesn't specifically disallow other variants so you could presumably accept a const version in C++ as well (and, in fact, g++ does).

The only thing you need to be careful of is trying to increase the size of any of the elements. The standards do not mandate how they're stored so extending one argument may (probably will) affect others, or some other unrelated data.

Cyndi answered 8/6, 2009 at 6:50 Comment(2)
To reflect the output "[Xello]", the example should be "./qq2 Hello", not just "./qq2", which gives a segfault.Lacteal
Spot on, @Gilles, I've changed the transcript to suit.Cyndi
Q
4

Empirically, functions such as GNU getopt() permute the argument list without causing problems. As @Tim says, as long as you play sensibly, you can manipulate the array of pointers, and even individual strings. Just don't overrun any of the implicit array boundaries.

Quintain answered 8/6, 2009 at 5:22 Comment(1)
Alright thank you. I am not sure why I thought the memory was special. I never had the need to do this before.Zsigmondy
F
4

Some libraries do this!

The initialization function provided by the GLUT OpenGL library (glutInit) scans for GLUT-related arguments and clears them by moving the subsequent elements in argv forward (moving the pointers, not the actual strings) and decrementing argc. From the GLUT documentation:

2.1 glutInit

glutInit is used to initialize the GLUT library.

Usage

void glutInit(int *argcp, char **argv);

argcp: A pointer to the program's unmodified argc variable from main. Upon return, the value pointed to by argcp will be updated, because glutInit extracts any command line options intended for the GLUT library.

argv: The program's unmodified argv variable from main. Like argcp, the data for argv will be updated because glutInit extracts any command line options understood by the GLUT library.

Fineable answered 18/5, 2018 at 9:3 Comment(0)
F
2

On many systems, the operating system pushes argc and argv onto the application's stack before executing it, and you can treat them like any other stack variables.

Fricandeau answered 8/6, 2009 at 7:48 Comment(0)
B
1

The only time I would say that directly manipulating argv is a bad idea would be when an application changes its behavior depending on the contents of argv[0].

However, changing a program's behavior depending on argv[0] is in itself a very bad idea where portability is a concern.

Other than that, you can treat it just like you would any other array. As Jonathan said, GNU getopt() permutes the argument list non-destructively; I've seen other getopt() implementations that go as far as serializing and even hashing the arguments (useful when a program comes close to ARG_MAX).

Just be careful with your pointer arithmetic.

Brandibrandice answered 8/6, 2009 at 5:56 Comment(4)
There are some nasty tricks where programs can be a passthrough for another program and I have seen all sorts of horrible manipulations of argv[0]. In my case I will be blanking some of the arguments out (setting them to an empty null-terminated string) so they are not revisited. Memory is more important in my particular circumstance and I know creating a proper data structure is preferable. Thanks for your input!Zsigmondy
@Zsigmondy In that case, I'd just build a hash table that has the 'undesired' arguments in it, move them to the right and (finally) snip them. I had almost the same problem on an appliance that had a very complicated LCD controller where the client was using an oversimplified CLI program to interact with / read it. I had roughly 8MB to work with, so I understand the need for the savings.Brandibrandice
You've provided your opinion about touching argv[0] but what are the arguments other than perhaps portability to non-Unix systems?Godbey
@JohanBoulé Practically speaking, I've never run into problems beyond that. It's helpful to run the program under something like Valgrind just to make sure it's doing what you think it's doing, but I haven't any specific experience with it exploding.Brandibrandice
S
1

The original allocation of argv is left as a compiler/runtime choice. So it may not be safe to modify it willy-nilly. Many systems build it on the stack, so it is auto-deallocated when main returns. Others build it on the heap, and free it (or not) when main returns.

It is safe to change the value of an argument, as long as you don't try to make it longer (buffer overrun error). It is safe to shuffle the order of the arguments.

To remove arguments you've preprocessed, something like this will work:

(Lots of error conditions are not checked for, "--special" appearing anywhere other than as the first argument is not handled, etc. This is, after all, just a demo-of-concept.)

#include <stdbool.h>
#include <string.h>

int main(int argc, char** argv)
{
    bool doSpecial = false; // an assumption
    // argv[1] is a null pointer when no arguments were given, so check argc first
    if (argc > 1 && 0 == strcmp(argv[1], "--special"))
    {
        doSpecial = true; // no longer an assumption
        // remove the "--special" argument by shifting the pointers down;
        //  the last iteration also copies the NULL stored at argv[argc].
        for(int i=1; i<argc; ++i)
            argv[i]  = argv[i+1];
        --argc;
    }
    // all normal processing with "--special" removed.
    // the doSpecial flag is available if wanted.
    return 0;
}

But see this for full manipulation (the part of the libiberty library that is used to manipulate argv-style vectors):

http://www.opensource.apple.com/source/gcc/gcc-5666.3/libiberty/argv.c

It is licensed under the GNU LGPL.

Schwartz answered 10/2, 2016 at 14:53 Comment(0)
