How do I find the location of the executable in C? [duplicate]
163

Is there a way in C/C++ to find the location (full path) of the current executed program?

(The problem with argv[0] is that it does not give the full path.)

Tamathatamaulipas answered 1/6, 2009 at 7:29 Comment(6)
which operating system?Bagger
I don't think there is a portable way to do this. Does argv[0] have the full path if you invoke the program with a full static path? If so, you could force the user to execute the binary as such, like sshd does.Lecky
Good answer here also: stackoverflow.com/questions/1023306/…Commutable
This question ("How can I find the location of my program in SETTING X?") could really use a tag; it's hard to search for using keywords!Lamond
Another problem with argv[0] is that it is not available if you are trying to do this in a library.Shortage
The top answer in the duplicate is more comprehensive than any or all of the answers here, though it is a more recent question. Consequently, I've closed this as a duplicate of that, rather than vice versa, even though this is the older question and it would normally get the nod as the master question on the topic.Psittacine
224

To summarize:

  • On Unixes with /proc, the most straightforward and reliable way is:

    • readlink("/proc/self/exe", buf, bufsize) (Linux)

    • readlink("/proc/curproc/file", buf, bufsize) (FreeBSD)

    • readlink("/proc/self/path/a.out", buf, bufsize) (Solaris)

  • On Unixes without /proc (i.e. if above fails):

    • If argv[0] starts with "/" (absolute path) this is the path.

    • Otherwise if argv[0] contains "/" (relative path) append it to cwd (assuming it hasn't been changed yet).

    • Otherwise search directories in $PATH for executable argv[0].

    Afterwards it may be reasonable to check whether the executable is actually a symlink. If it is, resolve it relative to the symlink's directory.

    This step is not necessary with the /proc method (at least on Linux). There the proc symlink points directly to the executable.

    Note that it is up to the calling process to set argv[0] correctly. It is right most of the time; however, there are occasions when the calling process cannot be trusted (e.g. a setuid executable).

  • On Windows: use GetModuleFileName(NULL, buf, bufsize)
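The Linux variant of the /proc approach could be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: the helper name linux_exe_path is hypothetical, and it assumes /proc is mounted. The one detail worth showing is that readlink() does not NUL-terminate its result, so the caller has to do it and has to treat a completely filled buffer as possible truncation.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not from the answer): copies the path of the running
 * executable into buf as a NUL-terminated string.
 * Returns the path length, or -1 on failure or if buf is too small. */
ssize_t linux_exe_path(char *buf, size_t bufsize)
{
    if (bufsize == 0)
        return -1;

    /* readlink() does not append a terminating NUL, so reserve one byte. */
    ssize_t len = readlink("/proc/self/exe", buf, bufsize - 1);
    if (len < 0)
        return -1;                  /* e.g. /proc missing or not mounted */
    if ((size_t)len == bufsize - 1)
        return -1;                  /* buffer filled: result may be truncated */

    buf[len] = '\0';
    return len;
}
```

A caller would typically pass a buffer of PATH_MAX bytes and fall back to the argv[0] heuristics when the function returns -1.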

Scintillate answered 1/6, 2009 at 8:40 Comment(15)
Anything that depends on argv[0] being the program name is not reliable. It will work most of the time, but not every time. This problem is hard on unixes without /procJoella
Not all unixes with proc have /proc/self/exe. The layout of /proc is entirely OS-specific and they all do it a bit differently. For example, FreeBSD provides /proc/curproc/file which works the same as Linux's /proc/self/exe. But others may not do this at all.Oscillogram
Note that if someone needs to conceal their tracks, then execl("/home/hacker/.hidden/malicious", "/bin/ls", "-s", (char *)0); leaves argv[0] with an absolute pathname that has nothing whatsoever to do with the name of the file executed. The other information is useful, though; thanks.Psittacine
@Jonathan Leffler Of course it is up to the spawning process to set argv[0] correctly, and the called program can easily be fooled. Your example however is odd. What's the point of misleading a malicious program? A better example would be a setuid executable. In that kind of code argv[0] cannot be trusted.Scintillate
@lispmachine: it is not so much misleading the malicious program that I'm worried about; it is about misleading a library that the malicious program uses - where the library tries to work out which program is using it to relay the information to a DBMS. Unless you have something like /proc/self/exe, I don't see how the library can work out reliably which program is executing it. It is causing a major headache right now. I'm interested in the AIX and HP-UX solutions to the 'identify the executable' problem, too. (Linux, MacOS X, Solaris, Windows don't seem to be a major problem.)Psittacine
If you don't use /proc (perhaps because it's so system-specific), in practice you're often going to fall into case 3 where you're searching PATH. Unfortunately, this is unreliable: people can (and I often do) run programs with a PATH different than the current shell, and a program looking for itself this way may well get the wrong answer. As others have pointed out, none of these approaches (besides maybe the /proc one) actually works reliably.Frenchpolish
@DavePacheco Changing the $PATH within the shell is not a problem at all, since the spawned process will inherit env variables from the shell. The problem would arise when the calling process does not set argv[0] to the name/path of the executable.Scintillate
@JonathanLeffler Code from the library is run with the same privileges as the code of the program using it. As far as I know the only generally accepted method of privilege separation on OS level is to run separate processes.Scintillate
@Scintillate See execle(3). Programs can execute child processes with arbitrary environments that are different than the parent environment that was used to find the child executable. You're right that this would be more of a corner case, as it's not what would happen if someone ran "PATH=... mycmd" at the shell.Frenchpolish
@DavePacheco You could obviously mess up child's environment, but why would you?Scintillate
An obscure corner case: The Linux /proc/self/exe method doesn't quite work if the exe resides in a clearcase MVFS view, since Linux ends up returning the location of the executable in the view storage directory and not the /view qualified path. Example, for /vbs/bldsupp/linuxamd64/clang/debug/bin/llvm-config /proc/self/exe points me at the unfriendly path: /home/peeterj/views/peeterj_clang-7.vws/.s/00024/8000023250b8f17fllvm-tblgenPrecede
On Windows you don't have to call GetModuleFileName. Instead, just #include <windows.h> and use the path string provided automatically by Windows in _pgmptr. It's easier than using the GetModuleFileName function because that has the possibility of failing.Daughterly
@Jonathan I'm interested in AIX and HP-UX too - the only platforms our software supports that we use an argv[0] fallback for. Did you ever find a solution?Trina
@NicholasWilson: We did find solutions for HP-UX and AIX that did not rely on access to argv[0], but I do not now recall what the function calls were (other than 'non-portable'). I don't have direct access to the code any more, so I can't look up what we did. Time to hit the o/s manuals — section 2 and/or section 3.Psittacine
AT_EXECFN is also a possibility.Lookin
27

Use GetModuleFileName() function if you are using Windows.

Hamitic answered 1/6, 2009 at 7:40 Comment(1)
Thanks, but I'm using linux and unix.Tamathatamaulipas
19

Please note that the following comments are unix-only.

The pedantic answer to this question is that there is no general way to answer this question correctly in all cases. As you've discovered, argv[0] can be set to anything at all by the parent process, and so need have no relation whatsoever to the actual name of the program or its location in the file system.

However, the following heuristic often works:

  1. If argv[0] is an absolute path, assume this is the full path to the executable.
  2. If argv[0] is a relative path, ie, it contains a /, determine the current working directory with getcwd() and then append argv[0] to it.
  3. If argv[0] is a plain word, search $PATH looking for argv[0], and append argv[0] to whatever directory you find it in.

Note that all of these can be circumvented by the process which invoked the program in question. Finally, you can use linux-specific techniques, such as mentioned by emg-2. There are probably equivalent techniques on other operating systems.

Even supposing that the steps above give you a valid path name, you still might not have the path name you actually want (since I suspect that what you actually want to do is find a configuration file somewhere). The presence of hard links means that you can have the following situation:

-- assume /app/bin/foo is the actual program
$ mkdir /some/where/else
$ ln /app/bin/foo /some/where/else/foo     # create a hard link to foo
$ /some/where/else/foo

Now, the approach above (including, I suspect, /proc/$pid/exe) will give /some/where/else/foo as the real path to the program. And, in fact, it is a real path to the program, just not the one you wanted. Note that this problem doesn't occur with symbolic links which are much more common in practice than hard links.

In spite of the fact that this approach is in principle unreliable, it works well enough in practice for most purposes.

Supernova answered 1/6, 2009 at 8:12 Comment(0)
13

Not an answer actually, but just a note to keep in mind.

As we have seen, finding the location of the running executable is quite tricky and platform-specific on Linux and Unix. One should think twice before doing that.

If you need your executable's location to discover configuration or resource files, maybe you should follow the Unix convention for placing files in the system: put configs in /etc, /usr/local/etc, or the current user's home directory, and put resource files in /usr/share.

Fairhaired answered 2/6, 2009 at 11:34 Comment(0)
7

Remember that on Unix systems the binary may have been removed since it was started. It's perfectly legal and safe on Unix. Last I checked Windows will not allow you to remove a running binary.

/proc/self/exe will still be readable, but it will not be a working symlink really. It will be... odd.
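On Linux, the "odd" part is that the link target read from /proc/self/exe is the original path with " (deleted)" appended. A program that cares could check for that marker, as in the sketch below. The helper name is hypothetical, and the check is only a heuristic: a file whose real name happens to end in " (deleted)" would be misreported.

```c
#include <string.h>

/* Hypothetical helper: returns 1 if a path read from /proc/self/exe ends
 * with the " (deleted)" marker that Linux appends once the binary has been
 * unlinked, 0 otherwise. Heuristic only: a real file name may end the same way. */
int exe_link_looks_deleted(const char *target)
{
    static const char suffix[] = " (deleted)";
    size_t tlen = strlen(target);
    size_t slen = sizeof suffix - 1;   /* length without the trailing NUL */

    return tlen >= slen && strcmp(target + tlen - slen, suffix) == 0;
}
```

The function takes the string already obtained from readlink(), so it can be tested without actually deleting a running binary.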

Margiemargin answered 1/6, 2009 at 9:18 Comment(4)
Does Unix load the whole executable into memory before running it? In that case, how can it have enough memory to run very large programs such as some self-extracting archives?Inerrable
Generally no. The executable is locked (trying to open for write gives "text file busy") and "memory mapped", meaning it looks like it's all in memory, but it will be lazily loaded the first time a memory page is accessed. If it's a read-only page (as code tends to be) then the kernel can "forget" the data if it needs the memory, and re-load it when it gets accessed again. Sort of like swapping, but since code isn't modified it will never be written back to disk.Margiemargin
I suppose in that case, if an unloaded memory page has to be loaded again after the file has been deleted, something serious will happen.Inerrable
Nothing will happen. IIRC, you don't "delete" a file, you merely remove its entry in a directory. The actual file will be deleted automatically as soon as all references to it vanish. References can be file system entries (hard links) or open file descriptors (like being a process' image). So the file will be invisible to you after you "delete" it, but the actual deletion will be deferred until the process terminates.Wigan
6

In many POSIX systems you could check a symlink located under /proc/PID/exe. A few examples:

# file /proc/*/exe
/proc/1001/exe: symbolic link to /usr/bin/distccd
/proc/1023/exe: symbolic link to /usr/sbin/sendmail.sendmail
/proc/1043/exe: symbolic link to /usr/sbin/crond
Bagger answered 1/6, 2009 at 7:37 Comment(6)
/proc is not POSIX, and it's not very standardized. Many modern Unices have it, some don't.Aviculture
Always good to learn new things. Thanks. Is there more "programmatic" way to do this?Tamathatamaulipas
@Dietrich: you're right, it's not POSIX. According to Wikipedia, the unix-like systems that have it are: Linux, AIX, BSD, Solaris, QNX. However, it's not stated whether all those systems have the /proc/*/cmd symlink.Bagger
Solaris does have /proc, but doesn't have /proc/*/cmd.Epifocal
@unknown(google): check out man 2:3 readlink or here: linux.die.net/man/3/readlinkBagger
In Linux /proc filesystem is a kernel level option- so it can't be guaranteed to be available or enabled on any given Linux system. Also, it could be enabled in the kernel but not available if /etc/fstab does not have a mount point for it. Also, you may run into security issues.Pronunciamento
4

On Mac OS X, use _NSGetExecutablePath.

See man 3 dyld and this answer to a similar question.
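A minimal sketch of calling _NSGetExecutablePath is shown below. The wrapper name mac_exe_path is hypothetical, and the #ifdef guard is only there so the snippet still compiles on other systems, where it simply reports failure. Per the dyld man page, the function returns 0 on success and -1 if the buffer is too small (in which case it stores the required size); the returned path may be a symlink and is not necessarily canonical.

```c
#include <stddef.h>
#include <stdint.h>
#ifdef __APPLE__
#include <mach-o/dyld.h>
#endif

/* Hypothetical wrapper: fills buf with the executable path on macOS.
 * Returns 0 on success, -1 on failure (always -1 on non-Apple systems). */
int mac_exe_path(char *buf, size_t bufsize)
{
#ifdef __APPLE__
    /* The API takes the buffer size as a uint32_t in/out parameter. */
    uint32_t size = (bufsize > UINT32_MAX) ? UINT32_MAX : (uint32_t)bufsize;
    return _NSGetExecutablePath(buf, &size) == 0 ? 0 : -1;
#else
    (void)buf;
    (void)bufsize;
    return -1;      /* not available on this platform */
#endif
}
```

If a canonical path is needed, the result can be passed through realpath() afterwards.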

Monteith answered 19/9, 2014 at 17:42 Comment(0)
3

For Linux you can find the /proc/self/exe way of doing things bundled up in a nice library called binreloc, you can find the library at:

Borroff answered 5/8, 2009 at 21:40 Comment(0)
1

I would

  1. Use the dirname() function: http://linux.die.net/man/3/dirname
  2. chdir() to that directory
  3. Use getcwd() to get the current directory

That way you'll get the directory in a neat, full form, instead of ./ or ../bin/.

Maybe you'll want to save and restore the current directory, if that is important for your program.

Billibilliard answered 19/6, 2009 at 19:3 Comment(4)
To save the current directory, open(".") and return to it later using fchdir().Fen
realpath() can turn a relative pathname to absolute and resolve symbolic links.Fen
What is this function getpwd?Hessney
@Hessney typo! It's getcwd(). This answer has stayed wrong for 14 years. Also, dirname(), not basename().Billibilliard
