Alternative way to obtain argc and argv of a process
P

12

29

I'm looking for alternative ways to obtain the command line parameters argc and argv provided to a process without having direct access to the variables passed into main().

I want to make a class that is independent of main() so that argc and argv don't have to be passed explicitly to the code that uses them.

EDIT: Some clarification seems to be in order. I have this class.

class Application
{
  int const argc_;
  char const** const argv_;

public:
  explicit Application(int, char const*[]);
};

Application::Application(int const argc, char const* argv[]) :
  argc_(argc),
  argv_(argv)
{
}

But I'd like a default constructor Application::Application(), with some (most probably) C code, that pulls argc and argv from somewhere.

Polypetalous answered 3/5, 2016 at 7:26 Comment(9)
What do you mean by "obtain"? Where else from? - Rafi
There's no portable or standard way of getting arguments to the program except the arguments to main. - Freddie
@PaulR yes, yes, the APIs are in C, as you know, but the class is in C++. I didn't want an answer featuring Python, even though it would be cool. - Polypetalous
@JoachimPileborg True, but the question is OS-specific, non-portable. - Polypetalous
As for the user not needing to pass argc and argv to the "class", take a look at just about all portable and platform-independent GUI frameworks: they all need the user of the framework to pass argc and argv to the framework explicitly. So it's common and not something programmers are unused to. - Freddie
You tag this question linux, windows, posix and bsd and say this question is OS-specific? If it was OS-specific you would mention only one OS, the one you target. - Freddie
@JoachimPileborg The frameworks contain OS-specific code, not just portable code; there is no reason whatsoever not to make the command line gathering part OS-specific too. I'm trying to gather as many ways to gather the command line as possible, but not infinite and not opinion-based. - Polypetalous
Chicken-and-egg problem: if you want code that provides the arguments, then that code has to rely on main; directly or indirectly, you will have a main. - Folkways
We should stop upvoting ill-formed questions before getting the question right. The question is still not clear to me: you basically want to get those parameters without the need to pass them? You can create a wrapper that, if injected, provides the parameters indirectly. I just think that once the OP's requirements are clear, we can determine whether what he wants is even possible. - Folkways
L
37

On Linux, you can get this information from the process's proc file system, namely /proc/$$/cmdline:

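int pid = getpid();
char fname[PATH_MAX];
char cmdline[ARG_MAX];
snprintf(fname, sizeof fname, "/proc/%d/cmdline", pid);
FILE *fp = fopen(fname, "r");
if (fp != NULL) {
    // the arguments are separated by '\0', so read raw bytes instead of a line
    size_t len = fread(cmdline, 1, sizeof cmdline, fp);
    fclose(fp);
    // cmdline now holds len bytes: each argument followed by a '\0'
}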
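
Because the arguments in /proc/self/cmdline are separated by '\0' bytes, the buffer has to be split before it is useful. A minimal C++ sketch, assuming Linux with /proc mounted; split_nul_separated and read_proc_cmdline are hypothetical helper names, not part of any library:

```cpp
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Split a buffer in which every argument is terminated by '\0'.
std::vector<std::string> split_nul_separated(const std::string& raw) {
    std::vector<std::string> args;
    std::string current;
    for (char c : raw) {
        if (c == '\0') {
            args.push_back(current);  // also keeps empty arguments
            current.clear();
        } else {
            current.push_back(c);
        }
    }
    if (!current.empty()) args.push_back(current);  // tolerate a missing final '\0'
    return args;
}

// Assumption: Linux with /proc mounted and readable.
// Returns an empty vector if the file cannot be opened.
std::vector<std::string> read_proc_cmdline() {
    std::ifstream in("/proc/self/cmdline", std::ios::binary);
    std::string raw((std::istreambuf_iterator<char>(in)),
                    std::istreambuf_iterator<char>());
    return split_nul_separated(raw);
}
```

Calling read_proc_cmdline() anywhere in the process should then give one string per argument, as long as the process has not overwritten its own argument area.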
Longhair answered 3/5, 2016 at 7:38 Comment(6)
1. you can read /proc/self/cmdline 2. args could have been destroyed 3. it is super odd to deal with /proc files with the FILE abstraction 4. ARG_MAX is only a limit for one argument, not all args in total 5. missing error checks, arguably could be omitted in the example. However, the biggest issue is that the OP is likely trying to do something wrong, and an answer like this should not be posted in the first place without further clarification. - Samson
I believe that /proc/*/cmdline is truncated to some max length constant. This is not a limitation of the size of the process command-line that the OS can start. Instead, this is a limitation of the process list record keeping in Linux. So you can start a process with a longer list of arguments, but the kernel is not going to remember them all for the purpose of trying to read arguments from the process list. - Chondrite
The kernel does not memorize arguments. Instead it stores addresses for both start and end of said args and then reads them from the target process address space. I don't see any code truncating the result either (unless the requested amount is too small of course). lxr.free-electrons.com/source/fs/proc/base.c#L199 - Samson
Trying to use this, but only getting the command name (with path, if it was explicitly specified). So cat /proc/self/cmdline x y z returns cat/proc/self/cmdlinexyz, but fgets on that file returns just cat. Why is that? - Melanson
(answering myself) - Turns out the args are NUL-separated. Can't just use a string :( - Melanson
This is such a hack... Such an amazingly beautiful, delightful hack! - Duran
F
30

The arguments to main are defined by the C runtime and are the only standard/portable way to obtain the command line arguments. Don't fight the system. :)

If all you want to do is to provide access to command line parameters in other parts of the program with your own API, there are many ways to do so. Just initialise your custom class using argv/argc in main and from that point onward you can ignore them and use your own API. The singleton pattern is great for this sort of thing.

To illustrate, one of the most popular C++ frameworks, Qt uses this mechanism:

int main(int argc, char* argv[])
{
    QCoreApplication app(argc, argv);

    // QString has no operator<< for std::ostream, so convert it first
    std::cout << app.arguments().at(0).toStdString() << std::endl;

    return app.exec();
}

The arguments are captured by the app and copied into a QStringList. See QCoreApplication::arguments() for more details.

Similarly, Cocoa on the Mac has a special function which captures the command line arguments and makes them available to the framework:

#import <Cocoa/Cocoa.h>

int main(int argc, char *argv[])
{
    return NSApplicationMain(argc, (const char **)argv);
}

The arguments are then available anywhere in the app using the NSProcessInfo.arguments property.

I notice in your updated question that your class directly stores a copy of argc/argv verbatim in its instance:

int const argc_;
char const** const argv_;

While this should be safe (the argv pointers should remain valid for the full lifetime of the process), it is not very C++-like. Consider creating a vector of strings (std::vector<std::string>) as a container and copying the strings in. Then they can even be safely mutable (if you want!).
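
A minimal sketch of that suggestion, reworking the Application class from the question so that it owns copies of the arguments; the arguments() accessor is a hypothetical name chosen for illustration:

```cpp
#include <string>
#include <vector>

class Application {
    std::vector<std::string> args_;  // owned copies, independent of argv's storage

public:
    // Copies argv[0] .. argv[argc - 1] into the vector.
    Application(int argc, char const* const argv[])
        : args_(argv, argv + argc) {}

    std::vector<std::string> const& arguments() const { return args_; }
};
```

In main this would be constructed as Application app(argc, argv); a char** converts implicitly to char const* const*, so no cast is needed.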

I want to make a class that is independent of main() so that argc and argv don't have to be passed explicitly to the code that uses them.

It is not clear why passing this info from main is somehow a bad thing that is to be avoided. This is just how the major frameworks do it.

I suggest you look at using a singleton to ensure there is only one instance of your Application class. The arguments can be passed in via main but no other code need know or care that this is where they came from.

And if you really want to hide the fact that main's arguments are being passed to your Application constructor, you can hide them with a macro.

Flatiron answered 3/5, 2016 at 7:37 Comment(9)
Classes in C? Say it ain't so! - Swadeshi
Upvoting this. The typical method I see used for APIs that want CL access to parse their own arguments (e.g. Qt) is to just ask for argv and argc. The nice thing about using the standard idiom is that experienced programmers will know what you are doing at a glance. - Janellajanelle
I understand the idea of using a global (it's ugly, but it gets the job done), however WHY enforce uniqueness? What's wrong with allowing anyone to create a fake set of arguments and pass those instead? - Muumuu
It's exactly what I don't want. It may be wrong to find argc and argv using alternative means, but why should people be discouraged from doing so anyway? There are situations apart from the one I described, where it seems to be useful to be able to do so. - Polypetalous
@MatthieuM. The singleton is merely to preserve the same semantics as the data it encapsulates; there is one set of read-only arguments. Not a strong requirement. - Flatiron
From a strategic point of view, many SE users consider any singleton use to be a code smell, and will downvote an otherwise good answer just for mentioning them. As someone who upvoted this answer, I think that would be a shame. If it's not central to your answer, there's no point in suggesting something controversial. - Janellajanelle
@Polypetalous You say that you don't want to access the arguments the standard way, but not why. Accessing the arguments from another part of the application is possible, regardless of how they are obtained in the first place. If you don't want to use the defined mechanism, you would need to circumvent the C runtime startup code, which opens a whole can of worms. - Flatiron
@Flatiron Interestingly, on macOS, NSApplicationMain actually ignores the passed in argc and argv and gets them directly from _NSGetArgc and _NSGetArgv. - Acceptation
@SaagarJha Interesting, yes I just looked up the implementation in opensource.apple.com/source/Libc/Libc-763.13/sys/… . It relies on very specific and detailed knowledge of dyld and libSystem internals. - Flatiron
T
19

I totally agree with @gavinb and others. You really should use the arguments from main and store them or pass them where you need them. That's the only portable way.

However, for educational purposes only, the following works for me with clang on OS X and gcc on Linux:

#include <stdio.h>

__attribute__((constructor)) void stuff(int argc, char **argv)
{
    for (int i=0; i<argc; i++) {
        printf("%s: argv[%d] = '%s'\n", __FUNCTION__, i, argv[i]);
    }
}

int main(int argc, char **argv)
{
    for (int i=0; i<argc; i++) {
        printf("%s: argv[%d] = '%s'\n", __FUNCTION__, i, argv[i]);
    }
    return 0;
}

which will output:

$ gcc -std=c99 -o test test.c && ./test this will also get you the arguments
stuff: argv[0] = './test'
stuff: argv[1] = 'this'
stuff: argv[2] = 'will'
stuff: argv[3] = 'also'
stuff: argv[4] = 'get'
stuff: argv[5] = 'you'
stuff: argv[6] = 'the'
stuff: argv[7] = 'arguments'
main: argv[0] = './test'
main: argv[1] = 'this'
main: argv[2] = 'will'
main: argv[3] = 'also'
main: argv[4] = 'get'
main: argv[5] = 'you'
main: argv[6] = 'the'
main: argv[7] = 'arguments'

This works because the stuff function is marked with __attribute__((constructor)), which makes it run when the binary or library containing it is loaded by the dynamic linker. In the main program, that means it runs even before main, in a similar environment, which is why it can see the arguments.

But let me repeat: This is for educational purposes only and shouldn't be used in any production code. It won't be portable and might break at any point in time without warning.
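
For completeness, a hedged C++ sketch of how the same trick could stash the arguments for later use. It assumes a runtime that passes argc and argv to constructor functions (glibc and dyld do, as an extension; musl does not, in which case the captured values are garbage). The names capture_args and captured_arguments are hypothetical:

```cpp
#include <string>
#include <vector>

namespace {
// Plain globals are zero-initialized before any constructor function runs.
int g_argc = 0;
char** g_argv = nullptr;

// Assumption: the runtime calls constructor functions with (argc, argv, envp).
// This is an implementation extension, not something the attribute guarantees.
__attribute__((constructor)) void capture_args(int argc, char** argv) {
    g_argc = argc;
    g_argv = argv;
}
}  // namespace

// Returns copies of the captured arguments, or an empty vector if none were seen.
std::vector<std::string> captured_arguments() {
    if (g_argv == nullptr) return {};
    return std::vector<std::string>(g_argv, g_argv + g_argc);
}
```

The same caveat applies: this is for experimentation, not for production code.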

Tope answered 3/5, 2016 at 19:0 Comment(3)
That's pretty nifty. Are __attribute__((constructor)) functions guaranteed to have access to argv/argc, or does this just abuse some accident of the representation? - Leid
It's not guaranteed at all. - Tope
glibc and dyld implement this extension; musl does not. Make of that what you will. - Acceptation
G
16

To answer the question in part, concerning Windows, the command line can be obtained as the return of the GetCommandLine function, which is documented here, without explicit access to the arguments of the main function.
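
A minimal sketch of how this could be wrapped, assuming a Windows build that links against Shell32: GetCommandLineW returns the whole command line as one string, and CommandLineToArgvW splits it into an argv-style array. The helper name windows_arguments is hypothetical, and on any other platform the sketch deliberately returns an empty vector:

```cpp
#ifdef _WIN32
#include <windows.h>
#include <shellapi.h>  // CommandLineToArgvW; link with Shell32
#endif
#include <string>
#include <vector>

// Returns the process arguments as wide strings on Windows.
// On other platforms this sketch has nothing to query and returns an empty vector.
std::vector<std::wstring> windows_arguments() {
    std::vector<std::wstring> args;
#ifdef _WIN32
    int count = 0;
    LPWSTR* list = CommandLineToArgvW(GetCommandLineW(), &count);
    if (list != nullptr) {
        args.assign(list, list + count);  // copy each argument
        LocalFree(list);                  // the array is one LocalAlloc block
    }
#endif
    return args;
}
```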

Giese answered 3/5, 2016 at 7:30 Comment(2)
This is a popular answer. You may consider adding an example in-line of how to use GetCommandLine. - Kleiman
Or just with __argc, __argv and __wargv. - Berceuse
U
5

In Windows, if you need to get the arguments as wchar_t *, you can use CommandLineToArgvW():

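#include <stdio.h>
#include <windows.h>
#include <shellapi.h> /* CommandLineToArgvW(); link with shell32 */

int wmain(int argc, wchar_t *argv[]); /* the program's real entry point, defined elsewhere */

int main(void)
{
    LPWSTR *sz_arglist;
    int n_args;
    int result;
    sz_arglist = CommandLineToArgvW(GetCommandLineW(), &n_args);
    if (sz_arglist == NULL)
    {
        fprintf(stderr, "CommandLineToArgvW() failed.\n");
        return 1;
    }
    result = wmain(n_args, sz_arglist);
    LocalFree(sz_arglist);
    return result;
}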

This is very convenient when using MinGW because gcc does not recognize int wmain(int, wchar_t **) as a valid main prototype.

Urease answered 3/5, 2016 at 8:31 Comment(0)
F
5

Passing values doesn't constitute creating a dependency. Your class doesn't care about where those argc or argv values come out of - it just wants them passed. You may want to copy the values somewhere, though - there's no guarantee that they are not changed (the same applies to alternate methods like GetCommandLine).

Quite the opposite, in fact - you're creating a hidden dependency when you use something like GetCommandLine. Suddenly, instead of a simple "pass a value" semantics, you have "magically take their inputs from elsewhere" - combined with the aforementioned "the values can change at any time", this makes your code a lot more brittle, not to mention impossible to test. And parsing command line arguments is definitely one of the cases where automated testing is quite beneficial. It's a global variable vs. a method argument approach, if you will.
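
To illustrate the testability point, here is a small sketch of a parser helper that depends only on the values it is handed. The function has_flag is a hypothetical example, not an existing API:

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Reports whether `flag` appears among the arguments after the program name.
// It takes the argument list as a value, so a test can pass any list it likes.
bool has_flag(std::vector<std::string> const& args, std::string const& flag) {
    if (args.empty()) return false;
    // args[0] is the program name by convention, so skip it
    return std::find(args.begin() + 1, args.end(), flag) != args.end();
}
```

A test can call has_flag({"prog", "--verbose"}, "--verbose") without launching a process, which is not possible when the function fetches the command line from a global source itself.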

Fichte answered 3/5, 2016 at 14:8 Comment(2)
But the OP states that this is in C, so there are no classes. You could pass them to an init function, and have the values stored in static variables in the module code. - Hiss
@Hiss Well, I'm just using the same name the OP used :) Passing the values is the important part, not how exactly it is implemented. - Fichte
S
3

In C/C++, if main() doesn't export them, then there isn't a direct way to access them; however, that doesn't mean there isn't an indirect way. Many POSIX-like systems use the ELF format, where argc, argv and envp are passed on the stack, picked up by _start() during initialization, and passed into main() via the normal calling convention. This is typically done in assembly (because there is still no portable way to get the stack pointer) and put in a "start file", typically with some variation of the name crt.o.

If you don't have access to main() so that you can just export the symbols, you probably aren't going to have access to _start() either. So why do I even mention it? Because of that third parameter, envp: environ is a standard exported variable that gets set during _start() using envp. On many ELF systems, if you take the base address of environ and walk it backwards using negative array indices, you can deduce the argc and argv parameters. The entry at index -1 should be NULL (the terminator of argv), preceded by the last argv entry, and so on back to the first. When the value at index -k, cast to long, equals k - 2, you have found argc, and the next entry (index -k + 1) is argv[0]; its address is argv.
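
A hedged sketch of that backwards walk, assuming a Linux-style ELF process whose environ still points at the original block on the initial stack. It is fragile by design: indexing before environ is outside anything the language guarantees, and it may stop working once environ has been reallocated (for example after setenv). FoundArgs and args_from_environ are hypothetical names:

```cpp
#include <cstdint>

extern "C" char** environ;  // exported by the C library on POSIX-like systems

struct FoundArgs {
    int argc;
    char** argv;  // nullptr when nothing plausible was found
};

// Assumed stack layout: argc, argv[0..argc-1], NULL, envp[0..], NULL
FoundArgs args_from_environ(long max_args = 4096) {
    char** env = environ;
    // environ[-1] must be the NULL that terminates argv
    if (env == nullptr || env[-1] != nullptr) return {0, nullptr};
    for (long k = 2; k <= max_args + 2; ++k) {
        // Stepping back k slots passes the NULL plus (k - 2) argv entries,
        // so the slot holds argc exactly when its value equals k - 2.
        auto value = reinterpret_cast<std::intptr_t>(env[-k]);
        if (value == k - 2) return {static_cast<int>(k - 2), env - k + 1};
    }
    return {0, nullptr};
}
```

Real argv pointers are large addresses, so they are unlikely to be mistaken for a small count, but nothing guarantees that; treat the result as a guess and prefer passing argc and argv from main.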

Steroid answered 3/5, 2016 at 22:41 Comment(1)
I wrote an Android app in FreePascal, and got reports that it was crashing on Android 13. In Pascal there is a function that you can call to get the args at any time. Normally, FreePascal copies the args in its main. But on Android, it is not an executable, but a .so which is loaded by the JVM, so it cannot access the args. So FreePascal has to search the args later, and it does this exactly in this way by walking backwards from environ. And apparently this is causing the crash. It is a rather bad idea. - Aurify
R
2

I know of a few common scenarios in which functions take arguments of the form int argc, char *argv[]. One obvious example is GLUT, whose initialization function takes over these arguments from main(), a kind of "nested main" scenario. This may or may not be your desired behavior. If not, since there is no required naming convention for these arguments, and as long as your function has its own argument parser and you know what you're doing, you can supply whatever you need, hardcoded:

int foo = 1;
char * bar[1] = {" "};

or read from user input or generated otherwise, AFAIK.

int myFunc( int foo, char *bar[]){
//argument parser
 {…   …}
return 0;
}

Please, see this SO post.

Rafi answered 3/5, 2016 at 7:44 Comment(0)
R
2

The most portable way would be to use a global variable for storing the parameters. You can make this less ugly by using a Singleton (like your class in the question, but a singleton initialized by main) or, similarly, a Service Locator, which is basically just the same: create an object in main, pass and store the params statically, and have another class or the same class access them.

Non-portable ways are using GetCommandLine on Windows, accessing /proc/<pid>/cmdline (or /proc/self/cmdline), or using compiler-specific extensions like __attribute__((constructor)).

Note that on Linux, getting the original command line as a single string via an equivalent of GetCommandLine is not possible (TL;DR: the command line is not passed to the Linux kernel as a string; it has already been parsed and split by the invoking process, e.g. the shell).

Roping answered 10/5, 2016 at 11:33 Comment(2)
Actually, on Windows, as far as I've been able to tell, the command line for a process is passed to the kernel as a string. The Win32 API function family CreateProcess takes a string and seems to pass it through a few other functions to the raw system call untouched. The _exec and similar functions actually combine the vector they take into a single string (!) without even bothering to quote it (!!!) which is passed to CreateProcess. - Scurry
The link I posted refers to Linux. I edited the answer to clarify this, thanks for the info! - Roping
C
2

Here is a proper C++ way of doing so:

#include <fstream>
#include <iostream>
#include <iterator>
#include <string>
#include <unistd.h>
#include <vector>

using namespace std;

int main(int argc, char** argv){
    // Open in binary mode: the arguments are separated by '\0' bytes
    ifstream reader("/proc/" + to_string(getpid()) + "/cmdline", ios::binary);
    vector<char> buffer(istreambuf_iterator<char>(reader), {});

    // Print one argument per line
    for(char c : buffer){
        if(c == '\0'){
            cout << endl;
        }else{
            cout << c;
        }
    }

    return 0;
}
Cholecystitis answered 20/12, 2022 at 15:34 Comment(2)
It's important to not just post code, but to also include a description of what the code does and why you are suggesting it. This helps others understand the context and purpose of the code, and makes it more useful for others who may be reading the question or answer. - Culmination
The code is just an interpretation of the C answer by @Longhair written in C++, as the question was clearly asking for a C++ way. The rest is explained in the original answer. - Cholecystitis
S
1

It sounds like what you want is a global variable; what you should do is just pass argc and argv as parameters.

Shanta answered 3/5, 2016 at 16:21 Comment(2)
That's exactly what I don't want to do. - Polypetalous
So you want a getter for a global variable, but you don't want the global variable eh? - Folkways
U
1

An instructor set a challenge to use the gcc option -nostartfiles and then try to access argc and argv. The -nostartfiles option causes argc and argv to not be populated. This was my best solution for 64-bit Linux, as it accesses argc and argv directly using the base pointer:

// Compile with gcc -nostartfiles -e main args.c -o args
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[]) // argc and argv are not available when compiled with -nostartfiles
{
    register void *rbp asm ("rbp");

    printf("argc is %ld\n", *(unsigned long *)(rbp + 8));

    for(int count = 0 ; count < *(unsigned long *)(rbp + 8) ; count++)
    {
        printf("argv[%d] is %s\n", count, *(char **)(rbp + 16 + count * 8));
    }

    exit(0);
}
University answered 27/8, 2021 at 17:39 Comment(0)
