Finding current executable's path without /proc/self/exe
Asked Answered
H

15

237

It seems to me that Linux has it easy with /proc/self/exe. But I'd like to know if there is a convenient way to find the current application's directory in C/C++ with cross-platform interfaces. I've seen some projects mucking around with argv[0], but it doesn't seem entirely reliable.

If you ever had to support, say, Mac OS X, which doesn't have /proc/, what would you have done? Use #ifdefs to isolate the platform-specific code (NSBundle, for example)? Or try to deduce the executable's path from argv[0], $PATH and whatnot, risking finding bugs in edge cases?

Hecker answered 21/6, 2009 at 6:12 Comment(7)
Duplicate: stackoverflow.com/questions/933850/…Morehead
I googled: get my ps -o comm. What brought me here is: "/proc/pid/path/a.out"Hushaby
IMHO prideout's answer deserves to be on top, because it correctly addresses the "cross-platform interfaces" requirement and is very easy to integrate.Snavely
In C++17 you can use std::current_path(). I don't see a portable C-style method as different OS use different string formats e.g., wchar_t, UTF-8, etc.Zygospore
@Zygospore std::filesystem::current_path() does not give the path to the current executable. It'll give you the current working directory.Princessprinceton
@TedLyngmo well, usually those two are the same... but yeah.Zygospore
@Zygospore No, the current directory is never the same as the executable.Princessprinceton
C
394

Some OS-specific interfaces:

There are also third party libraries that can be used to get this information, such as whereami as mentioned in prideout's answer, or if you are using Qt, QCoreApplication::applicationFilePath() as mentioned in the comments.

The portable (but less reliable) method is to use argv[0]. Although it could be set to anything by the calling program, by convention it is set to either a path name of the executable or a name that was found using $PATH.

Some shells, including bash and ksh, set the environment variable "_" to the full path of the executable before it is executed. In that case you can use getenv("_") to get it. However this is unreliable because not all shells do this, and it could be set to anything or be left over from a parent process which did not change it before executing your program.

Centaury answered 21/6, 2009 at 22:56 Comment(12)
Thanks @stepancheg; added. However it looks like procfs is not mounted by default on FreeBSD so I have linked to another question mentioning sysctl.Centaury
And also note that _NSGetExecutablePath() doesn't follow symlinks.Indiscipline
NetBSD: readlink /proc/curproc/exe DragonFly BSD: readlink /proc/curproc/fileIndiscipline
Solaris: char exepath[MAXPATHLEN]; sprintf(exepath, "/proc/%d/path/a.out", getpid()); readlink(exepath, exepath, sizeof(exepath));; that's different from getexecname() - which does the equiv of pargs -x <PID> | grep AT_SUN_EXECNAME ...Gaudreau
If you're able to use Qt, they've covered it: QDesktopServices::storageLocation(QDesktopServices::DataLocation) . See qt-project.org/doc/qt-4.8/qdesktopservices.htmlLucindalucine
"QDesktopServices::storageLocation(QDesktopServices::DataLocation)" That's not the executable's path, that's the pathname of the per-user directory where data should be stored.Counterstamp
QCoreApplication::applicationFilePath() can be used for Qt (4.x, 5.x) based applications, which as per documentation abstracts interfaces enumerated in the answer.Lenalenard
OpenBSD is the only one where you still can't in 2017. You have to use the PATH and argv[0] waySherise
Also Solaris: /proc/self/path/a.out. Not sure which versions are the oldest to support this.Maxia
@mark4o: Do you mind if I edit your answer to point to prideout's answer that shows how to achieve this with a readily available library?Resurrect
@krlmlr: Added, along with Qt as mentioned above.Centaury
I'm using GTK. Is there no cross-platform function in GTK available??Ganister
P
42

The use of /proc/self/exe is non-portable and unreliable. On my Ubuntu 12.04 system, you must be root to read/follow the symlink. This will make the Boost example and probably the whereami() solutions posted fail.

This post is very long but discusses the actual issues and presents code which actually works along with validation against a test suite.

The best way to find your program is to retrace the same steps the system uses. This is done by using argv[0] resolved against file system root, pwd, path environment and considering symlinks, and pathname canonicalization. This is from memory but I have done this in the past successfully and tested it in a variety of different situations. It is not guaranteed to work, but if it doesn't you probably have much bigger problems and it is more reliable overall than any of the other methods discussed. There are situations on a Unix compatible system in which proper handling of argv[0] will not get you to your program but then you are executing in a certifiably broken environment. It is also fairly portable to all Unix derived systems since around 1970 and even some non-Unix derived systems as it basically relies on libc() standard functionality and standard command line functionality. It should work on Linux (all versions), Android, Chrome OS, Minix, original Bell Labs Unix, FreeBSD, NetBSD, OpenBSD, BSD x.x, SunOS, Solaris, SYSV, HP-UX, Concentrix, SCO, Darwin, AIX, OS X, NeXTSTEP, etc. And with a little modification probably VMS, VM/CMS, DOS/Windows, ReactOS, OS/2, etc. If a program was launched directly from a GUI environment, it should have set argv[0] to an absolute path.

Understand that almost every shell on every Unix compatible operating system that has ever been released basically finds programs the same way and sets up the operating environment almost the same way (with some optional extras). And any other program that launches a program is expected to create the same environment (argv, environment strings, etc.) for that program as if it were run from a shell, with some optional extras. A program or user can setup an environment that deviates from this convention for other subordinate programs that it launches but if it does, this is a bug and the program has no reasonable expectation that the subordinate program or its subordinates will function correctly.

Possible values of argv[0] include:

  • /path/to/executable — absolute path
  • ../bin/executable — relative to pwd
  • bin/executable — relative to pwd
  • ./foo — relative to pwd
  • executable — basename, find in path
  • bin//executable — relative to pwd, non-canonical
  • src/../bin/executable — relative to pwd, non-canonical, backtracking
  • bin/./echoargc — relative to pwd, non-canonical

Values you should not see:

  • ~/bin/executable — rewritten before your program runs.
  • ~user/bin/executable — rewritten before your program runs
  • alias — rewritten before your program runs
  • $shellvariable — rewritten before your program runs
  • *foo* — wildcard, rewritten before your program runs, not very useful
  • ?foo? — wildcard, rewritten before your program runs, not very useful

In addition, these may contain non-canonical path names and multiple layers of symbolic links. In some cases, there may be multiple hard links to the same program. For example, /bin/ls, /bin/ps, /bin/chmod, /bin/rm, etc. may be hard links to /bin/busybox.

To find yourself, follow the steps below:

  • Save pwd, PATH, and argv[0] on entry to your program (or initialization of your library) as they may change later.

  • Optional: particularly for non-Unix systems, separate out but don't discard the pathname host/user/drive prefix part, if present; the part which often precedes a colon or follows an initial "//".

  • If argv[0] is an absolute path, use that as a starting point. An absolute path probably starts with "/" but on some non-Unix systems it might start with "" or a drive letter or name prefix followed by a colon.

  • Else if argv[0] is a relative path (contains "/" or "" but doesn't start with it, such as "../../bin/foo", then combine pwd+"/"+argv[0] (use present working directory from when program started, not current).

  • Else if argv[0] is a plain basename (no slashes), then combine it with each entry in PATH environment variable in turn and try those and use the first one which succeeds.

  • Optional: Else try the very platform specific /proc/self/exe, /proc/curproc/file (BSD), and (char *)getauxval(AT_EXECFN), and dlgetname(...) if present. You might even try these before argv[0]-based methods, if they are available and you don't encounter permission issues. In the somewhat unlikely event (when you consider all versions of all systems) that they are present and don't fail, they might be more authoritative.

  • Optional: check for a path name passed in using a command line parameter.

  • Optional: check for a pathname in the environment explicitly passed in by your wrapper script, if any.

  • Optional: As a last resort try environment variable "_". It might point to a different program entirely, such as the users shell.

  • Resolve symlinks, there may be multiple layers. There is the possibility of infinite loops, though if they exist your program probably won't get invoked.

  • Canonicalize filename by resolving substrings like "/foo/../bar/" to "/bar/". Note this may potentially change the meaning if you cross a network mount point, so canonization is not always a good thing. On a network server, ".." in symlink may be used to traverse a path to another file in the server context instead of on the client. In this case, you probably want the client context so canonicalization is ok. Also convert patterns like "/./" to "/" and "//" to "/". In shell, readlink --canonicalize will resolve multiple symlinks and canonicalize name. Chase may do similar but isn't installed. realpath() or canonicalize_file_name(), if present, may help.

If realpath() doesn't exist at compile time, you might borrow a copy from a permissively licensed library distribution, and compile it in yourself rather than reinventing the wheel. Fix the potential buffer overflow (pass in sizeof output buffer, think strncpy() vs strcpy()) if you will be using a buffer less than PATH_MAX. It may be easier just to use a renamed private copy rather than testing if it exists. Permissive license copy from android/darwin/bsd: https://android.googlesource.com/platform/bionic/+/f077784/libc/upstream-freebsd/lib/libc/stdlib/realpath.c

Be aware that multiple attempts may be successful or partially successful and they might not all point to the same executable, so consider verifying your executable; however, you may not have read permission — if you can't read it, don't treat that as a failure. Or verify something in proximity to your executable such as the "../lib/" directory you are trying to find. You may have multiple versions, packaged and locally compiled versions, local and network versions, and local and USB-drive portable versions, etc. and there is a small possibility that you might get two incompatible results from different methods of locating. And "_" may simply point to the wrong program.

A program using execve can deliberately set argv[0] to be incompatible with the actual path used to load the program and corrupt PATH, "_", pwd, etc. though there isn't generally much reason to do so; but this could have security implications if you have vulnerable code that ignores the fact that your execution environment can be changed in variety of ways including, but not limited, to this one (chroot, fuse filesystem, hard links, etc.) It is possible for shell commands to set PATH but fail to export it.

You don't necessarily need to code for non-Unix systems but it would be a good idea to be aware of some of the peculiarities so you can write the code in such a way that it isn't as hard for someone to port later. Be aware that some systems (DEC VMS, DOS, URLs, etc.) might have drive names or other prefixes which end with a colon such as "C:", "sys$drive:[foo]bar", and "file:///foo/bar/baz". Old DEC VMS systems use "[" and "]" to enclose the directory portion of the path though this may have changed if your program is compiled in a POSIX environment. Some systems, such as VMS, may have a file version (separated by a semicolon at the end). Some systems use two consecutive slashes as in "//drive/path/to/file" or "user@host:/path/to/file" (scp command) or "file://hostname/path/to/file" (URL). In some cases (DOS and Windows), PATH might have different separator characters — ";" vs ":" and "" vs "/" for a path separator. In csh/tsh there is "path" (delimited with spaces) and "PATH" delimited with colons but your program should receive PATH so you don't need to worry about path. DOS and some other systems can have relative paths that start with a drive prefix. C:foo.exe refers to foo.exe in the current directory on drive C, so you do need to lookup current directory on C: and use that for pwd.

An example of symlinks and wrappers on my system:

/usr/bin/google-chrome is symlink to
/etc/alternatives/google-chrome  which is symlink to
/usr/bin/google-chrome-stable which is symlink to
/opt/google/chrome/google-chrome which is a bash script which runs
/opt/google/chome/chrome

Note that user bill posted a link above to a program at HP that handles the three basic cases of argv[0]. It needs some changes, though:

  • It will be necessary to rewrite all the strcat() and strcpy() to use strncat() and strncpy(). Even though the variables are declared of length PATHMAX, an input value of length PATHMAX-1 plus the length of concatenated strings is > PATHMAX and an input value of length PATHMAX would be unterminated.
  • It needs to be rewritten as a library function, rather than just to print out results.
  • It fails to canonicalize names (use the realpath code I linked to above)
  • It fails to resolve symbolic links (use the realpath code)

So, if you combine both the HP code and the realpath code and fix both to be resistant to buffer overflows, then you should have something which can properly interpret argv[0].

The following illustrates actual values of argv[0] for various ways of invoking the same program on Ubuntu 12.04. And yes, the program was accidentally named echoargc instead of echoargv. This was done using a script for clean copying but doing it manually in shell gets same results (except aliases don't work in script unless you explicitly enable them).

cat ~/src/echoargc.c
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
main(int argc, char **argv)
{
  printf("  argv[0]=\"%s\"\n", argv[0]);
  sleep(1);  /* in case run from desktop */
}
tcc -o ~/bin/echoargc ~/src/echoargc.c
cd ~
/home/whitis/bin/echoargc
  argv[0]="/home/whitis/bin/echoargc"
echoargc
  argv[0]="echoargc"
bin/echoargc
  argv[0]="bin/echoargc"
bin//echoargc
  argv[0]="bin//echoargc"
bin/./echoargc
  argv[0]="bin/./echoargc"
src/../bin/echoargc
  argv[0]="src/../bin/echoargc"
cd ~/bin
*echo*
  argv[0]="echoargc"
e?hoargc
  argv[0]="echoargc"
./echoargc
  argv[0]="./echoargc"
cd ~/src
../bin/echoargc
  argv[0]="../bin/echoargc"
cd ~/junk
~/bin/echoargc
  argv[0]="/home/whitis/bin/echoargc"
~whitis/bin/echoargc
  argv[0]="/home/whitis/bin/echoargc"
alias echoit=~/bin/echoargc
echoit
  argv[0]="/home/whitis/bin/echoargc"
echoarg=~/bin/echoargc
$echoarg
  argv[0]="/home/whitis/bin/echoargc"
ln -s ~/bin/echoargc junk1
./junk1
  argv[0]="./junk1"
ln -s /home/whitis/bin/echoargc junk2
./junk2
  argv[0]="./junk2"
ln -s junk1 junk3
./junk3
  argv[0]="./junk3"


gnome-desktop-item-edit --create-new ~/Desktop
# interactive, create desktop link, then click on it
  argv[0]="/home/whitis/bin/echoargc"
# interactive, right click on gnome application menu, pick edit menus
# add menu item for echoargc, then run it from gnome menu
 argv[0]="/home/whitis/bin/echoargc"

 cat ./testargcscript 2>&1 | sed -e 's/^/    /g'
#!/bin/bash
# echoargc is in ~/bin/echoargc
# bin is in path
shopt -s expand_aliases
set -v
cat ~/src/echoargc.c
tcc -o ~/bin/echoargc ~/src/echoargc.c
cd ~
/home/whitis/bin/echoargc
echoargc
bin/echoargc
bin//echoargc
bin/./echoargc
src/../bin/echoargc
cd ~/bin
*echo*
e?hoargc
./echoargc
cd ~/src
../bin/echoargc
cd ~/junk
~/bin/echoargc
~whitis/bin/echoargc
alias echoit=~/bin/echoargc
echoit
echoarg=~/bin/echoargc
$echoarg
ln -s ~/bin/echoargc junk1
./junk1
ln -s /home/whitis/bin/echoargc junk2
./junk2
ln -s junk1 junk3
./junk3

These examples illustrate that the techniques described in this post should work in a wide range of circumstances and why some of the steps are necessary.

EDIT: Now, the program that prints argv[0] has been updated to actually find itself.

// Copyright 2015 by Mark Whitis.  License=MIT style
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <limits.h>
#include <assert.h>
#include <string.h>
#include <errno.h>

// "look deep into yourself, Clarice"  -- Hanibal Lector
char findyourself_save_pwd[PATH_MAX];
char findyourself_save_argv0[PATH_MAX];
char findyourself_save_path[PATH_MAX];
char findyourself_path_separator='/';
char findyourself_path_separator_as_string[2]="/";
char findyourself_path_list_separator[8]=":";  // could be ":; "
char findyourself_debug=0;

int findyourself_initialized=0;

void findyourself_init(char *argv0)
{

  getcwd(findyourself_save_pwd, sizeof(findyourself_save_pwd));

  strncpy(findyourself_save_argv0, argv0, sizeof(findyourself_save_argv0));
  findyourself_save_argv0[sizeof(findyourself_save_argv0)-1]=0;

  strncpy(findyourself_save_path, getenv("PATH"), sizeof(findyourself_save_path));
  findyourself_save_path[sizeof(findyourself_save_path)-1]=0;
  findyourself_initialized=1;
}


int find_yourself(char *result, size_t size_of_result)
{
  char newpath[PATH_MAX+256];
  char newpath2[PATH_MAX+256];

  assert(findyourself_initialized);
  result[0]=0;

  if(findyourself_save_argv0[0]==findyourself_path_separator) {
    if(findyourself_debug) printf("  absolute path\n");
     realpath(findyourself_save_argv0, newpath);
     if(findyourself_debug) printf("  newpath=\"%s\"\n", newpath);
     if(!access(newpath, F_OK)) {
        strncpy(result, newpath, size_of_result);
        result[size_of_result-1]=0;
        return(0);
     } else {
    perror("access failed 1");
      }
  } else if( strchr(findyourself_save_argv0, findyourself_path_separator )) {
    if(findyourself_debug) printf("  relative path to pwd\n");
    strncpy(newpath2, findyourself_save_pwd, sizeof(newpath2));
    newpath2[sizeof(newpath2)-1]=0;
    strncat(newpath2, findyourself_path_separator_as_string, sizeof(newpath2));
    newpath2[sizeof(newpath2)-1]=0;
    strncat(newpath2, findyourself_save_argv0, sizeof(newpath2));
    newpath2[sizeof(newpath2)-1]=0;
    realpath(newpath2, newpath);
    if(findyourself_debug) printf("  newpath=\"%s\"\n", newpath);
    if(!access(newpath, F_OK)) {
        strncpy(result, newpath, size_of_result);
        result[size_of_result-1]=0;
        return(0);
     } else {
    perror("access failed 2");
      }
  } else {
    if(findyourself_debug) printf("  searching $PATH\n");
    char *saveptr;
    char *pathitem;
    for(pathitem=strtok_r(findyourself_save_path, findyourself_path_list_separator,  &saveptr); pathitem; pathitem=strtok_r(NULL, findyourself_path_list_separator, &saveptr) ) {
       if(findyourself_debug>=2) printf("pathitem=\"%s\"\n", pathitem);
       strncpy(newpath2, pathitem, sizeof(newpath2));
       newpath2[sizeof(newpath2)-1]=0;
       strncat(newpath2, findyourself_path_separator_as_string, sizeof(newpath2));
       newpath2[sizeof(newpath2)-1]=0;
       strncat(newpath2, findyourself_save_argv0, sizeof(newpath2));
       newpath2[sizeof(newpath2)-1]=0;
       realpath(newpath2, newpath);
       if(findyourself_debug) printf("  newpath=\"%s\"\n", newpath);
      if(!access(newpath, F_OK)) {
          strncpy(result, newpath, size_of_result);
          result[size_of_result-1]=0;
          return(0);
      }
    } // end for
    perror("access failed 3");

  } // end else
  // if we get here, we have tried all three methods on argv[0] and still haven't succeeded.   Include fallback methods here.
  return(1);
}

main(int argc, char **argv)
{
  findyourself_init(argv[0]);

  char newpath[PATH_MAX];
  printf("  argv[0]=\"%s\"\n", argv[0]);
  realpath(argv[0], newpath);
  if(strcmp(argv[0],newpath)) { printf("  realpath=\"%s\"\n", newpath); }
  find_yourself(newpath, sizeof(newpath));
  if(1 || strcmp(argv[0],newpath)) { printf("  findyourself=\"%s\"\n", newpath); }
  sleep(1);  /* in case run from desktop */
}

And here is the output which demonstrates that in every one of the previous tests it actually did find itself.

tcc -o ~/bin/echoargc ~/src/echoargc.c
cd ~
/home/whitis/bin/echoargc
  argv[0]="/home/whitis/bin/echoargc"
  findyourself="/home/whitis/bin/echoargc"
echoargc
  argv[0]="echoargc"
  realpath="/home/whitis/echoargc"
  findyourself="/home/whitis/bin/echoargc"
bin/echoargc
  argv[0]="bin/echoargc"
  realpath="/home/whitis/bin/echoargc"
  findyourself="/home/whitis/bin/echoargc"
bin//echoargc
  argv[0]="bin//echoargc"
  realpath="/home/whitis/bin/echoargc"
  findyourself="/home/whitis/bin/echoargc"
bin/./echoargc
  argv[0]="bin/./echoargc"
  realpath="/home/whitis/bin/echoargc"
  findyourself="/home/whitis/bin/echoargc"
src/../bin/echoargc
  argv[0]="src/../bin/echoargc"
  realpath="/home/whitis/bin/echoargc"
  findyourself="/home/whitis/bin/echoargc"
cd ~/bin
*echo*
  argv[0]="echoargc"
  realpath="/home/whitis/bin/echoargc"
  findyourself="/home/whitis/bin/echoargc"
e?hoargc
  argv[0]="echoargc"
  realpath="/home/whitis/bin/echoargc"
  findyourself="/home/whitis/bin/echoargc"
./echoargc
  argv[0]="./echoargc"
  realpath="/home/whitis/bin/echoargc"
  findyourself="/home/whitis/bin/echoargc"
cd ~/src
../bin/echoargc
  argv[0]="../bin/echoargc"
  realpath="/home/whitis/bin/echoargc"
  findyourself="/home/whitis/bin/echoargc"
cd ~/junk
~/bin/echoargc
  argv[0]="/home/whitis/bin/echoargc"
  findyourself="/home/whitis/bin/echoargc"
~whitis/bin/echoargc
  argv[0]="/home/whitis/bin/echoargc"
  findyourself="/home/whitis/bin/echoargc"
alias echoit=~/bin/echoargc
echoit
  argv[0]="/home/whitis/bin/echoargc"
  findyourself="/home/whitis/bin/echoargc"
echoarg=~/bin/echoargc
$echoarg
  argv[0]="/home/whitis/bin/echoargc"
  findyourself="/home/whitis/bin/echoargc"
rm junk1 junk2 junk3
ln -s ~/bin/echoargc junk1
./junk1
  argv[0]="./junk1"
  realpath="/home/whitis/bin/echoargc"
  findyourself="/home/whitis/bin/echoargc"
ln -s /home/whitis/bin/echoargc junk2
./junk2
  argv[0]="./junk2"
  realpath="/home/whitis/bin/echoargc"
  findyourself="/home/whitis/bin/echoargc"
ln -s junk1 junk3
./junk3
  argv[0]="./junk3"
  realpath="/home/whitis/bin/echoargc"
  findyourself="/home/whitis/bin/echoargc"

The two GUI launches described above also correctly find the program.

There is one potential pitfall. The access() function drops permissions if the program is setuid before testing. If there is a situation where the program can be found as an elevated user but not as a regular user, then there might be a situation where these tests would fail, although it is unlikely the program could actually be executed under those circumstances. One could use euidaccess() instead. It is possible, however, that it might find an inaccessable program earlier on path than the actual user could.

Pitman answered 14/12, 2015 at 16:25 Comment(5)
You put a lot of effort into that — well done. Unfortunately, neither strncpy() nor (especially) strncat() is used safely in the code. strncpy() does not guarantee null termination; if the source string is longer than the target space, the string is not null terminated. strncat() is very hard to use; strncat(target, source, sizeof(target)) is wrong (even if target is an empty string to start with) if source is longer than the target. The length is the number of characters that can safely be appended to the target excluding the trailing null, so sizeof(target)-1 is the maximum.Transfusion
The strncpy code is correct, unlike the method you imply I should use. I suggest you read the code more carefully. It neither overflows buffers nor leaves them unterminated. Each use of strncpy()/stncat() is limited by copying sizeof(buffer), which is valid, and then the last character of the buffer is filled with a zero overwriting the last character of the buffer. strncat(), however, uses the size parameter incorrectly as a count and may overflow due to the fact that it predates buffer overflow attacks.Pitman
"sudo apt-get install libbsd0 libbsd-dev", then s/strncat/strlcat/Pitman
Do not use PATH_MAX. This stopped working 30 years ago, always use malloc.Sherise
Also if you use an init call. Fully resolve the path to the exe on init and not just a part and then do it later on call. No lazy evaluation here possible if you use realpath in the resolver. Together with the other erorrs simply the worst code i've seen on stackoverflow in a long answer.Sherise
E
30

The whereami library by Gregory Pakosz implements this for a variety of platforms, using the APIs mentioned in mark4o's post. This is most interesting if you "just" need a solution that works for a portable project and are not interested in the peculiarities of the various platforms.

At the time of writing, supported platforms are:

  • Windows
  • Linux
  • Mac
  • iOS
  • Android
  • QNX Neutrino
  • FreeBSD
  • NetBSD
  • DragonFly BSD
  • SunOS

The library consists of whereami.c and whereami.h and is licensed under MIT and WTFPL2. Drop the files into your project, include the header and use it:

#include "whereami.h"

int main() {
  int length = wai_getExecutablePath(NULL, 0, NULL);
  char* path = (char*)malloc(length + 1);
  wai_getExecutablePath(path, length, &dirname_length);
  path[length] = '\0';

  printf("My path: %s", path);

  free(path);
  return 0;
}
Evansville answered 26/9, 2015 at 4:14 Comment(0)
A
13

An alternative on Linux to using either /proc/self/exe or argv[0] is using the information passed by the ELF interpreter, made available by glibc as such:

#include <stdio.h>
#include <sys/auxv.h>

int main(int argc, char **argv)
{
    printf("%s\n", (char *)getauxval(AT_EXECFN));
    return(0);
}

Note that getauxval is a glibc extension, and to be robust you should check so that it doesn't return NULL (indicating that the ELF interpreter hasn't provided the AT_EXECFN parameter), but I don't think this is ever actually a problem on Linux.

Aim answered 13/5, 2014 at 0:52 Comment(5)
I like this since it's simple and glibc is included with Gtk+ anyway (which I'm using).Spiro
if I start my process with ./build/foo, your solution returns: ./build/foo, which is plain useless. (debian bullseye).Hattie
@BitTickler: Is it? I'd say the most common use for locating the executable is to access files relative to add, and there's no problem doing that with a relative path. And if you need an absolute path for any reason, it's not like you couldn't absolutize it.Aim
My use case was a gtk application with some resource files. While trying to get it started via gnome favorites (no idea what their current directory is...) I needed to build an absolute path to my resources (which are one folder up from my meson build directory with the executable...). So, if I do not know what . is, I actually cannot absolutize the path... I ended up using the readlink solution which returns an absolute path.Hattie
@BitTickler: You always know what . is, just call getcwd(). Even if you didn't, however, and you just need to go up a directory level, there's always ..Aim
H
8

Making this work reliably across platforms requires using #ifdef statements.

The below code finds the executable's path in Windows, Linux, MacOS, Solaris or FreeBSD (although FreeBSD is untested). It uses Boost 1.55.0 (or later) to simplify the code, but it's easy enough to remove if you want. Just use defines like _MSC_VER and __linux as the OS and compiler require.

#include <string>
#include <boost/predef/os.h>

#if (BOOST_OS_WINDOWS)
#  include <stdlib.h>
#elif (BOOST_OS_SOLARIS)
#  include <stdlib.h>
#  include <limits.h>
#elif (BOOST_OS_LINUX)
#  include <unistd.h>
#  include <limits.h>
#elif (BOOST_OS_MACOS)
#  include <mach-o/dyld.h>
#elif (BOOST_OS_BSD_FREE)
#  include <sys/types.h>
#  include <sys/sysctl.h>
#endif

/*
 * Returns the full path to the currently running executable,
 * or an empty string in case of failure.
 */
std::string getExecutablePath() {
    #if (BOOST_OS_WINDOWS)
        char *exePath;
        if (_get_pgmptr(&exePath) != 0)
            exePath = "";
    #elif (BOOST_OS_SOLARIS)
        char exePath[PATH_MAX];
        if (realpath(getexecname(), exePath) == NULL)
            exePath[0] = '\0';
    #elif (BOOST_OS_LINUX)
        char exePath[PATH_MAX];
        ssize_t len = ::readlink("/proc/self/exe", exePath, sizeof(exePath));
        if (len == -1 || len == sizeof(exePath))
            len = 0;
        exePath[len] = '\0';
    #elif (BOOST_OS_MACOS)
        char exePath[PATH_MAX];
        uint32_t len = sizeof(exePath);
        if (_NSGetExecutablePath(exePath, &len) != 0) {
            exePath[0] = '\0'; // buffer too small (!)
        } else {
            // resolve symlinks, ., .. if possible
            char *canonicalPath = realpath(exePath, NULL);
            if (canonicalPath != NULL) {
                strncpy(exePath,canonicalPath,len);
                free(canonicalPath);
            }
        }
    #elif (BOOST_OS_BSD_FREE)
        char exePath[2048];
        int mib[4];  mib[0] = CTL_KERN;  mib[1] = KERN_PROC;  mib[2] = KERN_PROC_PATHNAME;  mib[3] = -1;
        size_t len = sizeof(exePath);
        if (sysctl(mib, 4, exePath, &len, NULL, 0) != 0)
            exePath[0] = '\0';
    #endif
        return std::string(exePath);
}

The above version returns full paths including the executable name. If instead you want the path without the executable name, #include boost/filesystem.hpp> and change the return statement to:

return strlen(exePath)>0 ? boost::filesystem::path(exePath).remove_filename().make_preferred().string() : std::string();
Hemolysis answered 21/10, 2015 at 0:29 Comment(2)
@Frank, not sure why you say that. Works for me. I saw another response claims you need root to access /proc/self/exe but I haven't found that on any Linux systems I've tried (CentOS or Mint).Hemolysis
Works for me on Linux (Ubuntu 20.04) and macOS 12.6 (arm). Thanks a lot!When
A
6

If you ever had to support, say, Mac OS X, which doesn't have /proc/, what would you have done? Use #ifdefs to isolate the platform-specific code (NSBundle, for example)?

Yes, isolating platform-specific code with #ifdefs is the conventional way this is done.

Another approach would be to have a have clean #ifdef-less header which contains function declarations and put the implementations in platform specific source files.

For example, check out how POCO (Portable Components) C++ library does something similar for their Environment class.

Apparatus answered 21/6, 2009 at 7:50 Comment(0)
H
3

AFAIK, no such way. And there is also an ambiguity: what would you like to get as the answer if the same executable has multiple hard-links "pointing" to it? (Hard-links don't actually "point", they are the same file, just at another place in the file system hierarchy.)

Once execve() successfully executes a new binary, all information about the arguments to the original program is lost.

Hyperion answered 21/6, 2009 at 7:14 Comment(1)
"Once execve() successfully executes a new binary, all information about its arguments is lost." Actually, the argp and envp arguments aren't lost, they're passed as argv[] and the environment and, in some UN*Xes, the pathname argument or something constructed from it is either passed along with argp and envp (OS X/iOS, Solaris) or made available through one of the mechanisms listed in mark4o's answer. But, yes, that just gives you one of the hard links if there's more than one.Counterstamp
G
3

Depending on the version of QNX Neutrino, there are different ways to find the full path and name of the executable file that was used to start the running process. I denote the process identifier as <PID>. Try the following:

  1. If the file /proc/self/exefile exists, then its contents are the requested information.
  2. If the file /proc/<PID>/exefile exists, then its contents are the requested information.
  3. If the file /proc/self/as exists, then:
    1. open() the file.
    2. Allocate a buffer of, at least, sizeof(procfs_debuginfo) + _POSIX_PATH_MAX.
    3. Give that buffer as input to devctl(fd, DCMD_PROC_MAPDEBUG_BASE,....
    4. Cast the buffer to a procfs_debuginfo*.
    5. The requested information is at the path field of the procfs_debuginfo structure. Warning: For some reason, sometimes, QNX omits the first slash / of the file path. Prepend that / when needed.
    6. Clean up (close the file, free the buffer, etc.).
  4. Try the procedure in 3. with the file /proc/<PID>/as.
  5. Try dladdr(dlsym(RTLD_DEFAULT, "main"), &dlinfo) where dlinfo is a Dl_info structure whose dli_fname might contain the requested information.

I hope this helps.

Geisler answered 13/11, 2015 at 15:52 Comment(2)
great comment but sadly doesn't work at all (QNX 6.5.0), use '__progname' as externAmbsace
This procedure works for QNX 6.5 and 6.6 on ARMv7-A. You might have missed something.Geisler
W
3

In addition to mark4o's answer, FreeBSD also has

const char* getprogname(void)

It should also be available in macOS. It's available in GNU/Linux through libbsd.

Wilmington answered 11/9, 2020 at 16:6 Comment(1)
It's available, but it doesn't return the full path, or even the relative path from the cwd.Pennipennie
J
1

Well, of course, this doesn't apply to all projects. Still, QCoreApplication::applicationFilePath() never failed me in 6 years of C++/Qt development.

Of course, one should read documentation thoroughly before attempting to use it:

Warning: On Linux, this function will try to get the path from the /proc file system. If that fails, it assumes that argv[0] contains the absolute file name of the executable. The function also assumes that the current directory has not been changed by the application.

To be honest, I think that #ifdef and other solutions like that shouldn't be used in modern code at all.

I'm sure that smaller cross-platform libraries also exist. Let them encapsulate all that platform-specific stuff inside.

Johniejohnna answered 11/9, 2020 at 9:3 Comment(1)
I'm using GTK, is there no GTK cross-platform solution available?Ganister
A
1

If you're writing GPLed code and using GNU autotools, then a portable way that takes care of the details on many OSes (including Windows and macOS) is gnulib's relocatable-prog module.

Absolution answered 30/1, 2021 at 20:48 Comment(2)
The module documentation says: "In every program, add to main as the first statement (even before setting the locale or doing anything related to libintl): set_program_name (argv[0]); The prototype for this function is in progname.h.". That means programmer intervention is required. It is not clear that's what's intended.Transfusion
@JonathanLeffler You have to call set_program_name, yes. Not sure what you mean by "it's not clear that's what's intended".Absolution
C
0

You can use argv[0] and analyze the PATH environment variable. Look at : A sample of a program that can find itself

Climax answered 21/6, 2009 at 7:30 Comment(2)
This isn't actually reliable (though it will generally work with programs launched by the usual shells), because execv and kin take the path to the executable seperately of argvTriumphant
This is an incorrect answer. It might tell you where you could find a program with the same name. But it doesn't tell you anything about where the currently running executable actually lives.Tracheitis
U
0

Just my two cents. You can find the current application's directory in C/C++ with cross-platform interfaces by using this code.

void getExecutablePath(char ** path, unsigned int * pathLength)
{
    // Early exit when invalid out-parameters are passed
    if (!checkStringOutParameter(path, pathLength))
    {
        return;
    }

#if defined SYSTEM_LINUX

    // Preallocate PATH_MAX (e.g., 4096) characters and hope the executable path isn't longer (including null byte)
    char exePath[PATH_MAX];

    // Return written bytes, indicating if memory was sufficient
    int len = readlink("/proc/self/exe", exePath, PATH_MAX);

    if (len <= 0 || len == PATH_MAX) // memory not sufficient or general error occured
    {
        invalidateStringOutParameter(path, pathLength);
        return;
    }

    // Copy contents to caller, create caller ownership
    copyToStringOutParameter(exePath, len, path, pathLength);

#elif defined SYSTEM_WINDOWS

    // Preallocate MAX_PATH (e.g., 4095) characters and hope the executable path isn't longer (including null byte)
    char exePath[MAX_PATH];

    // Return written bytes, indicating if memory was sufficient
    unsigned int len = GetModuleFileNameA(GetModuleHandleA(0x0), exePath, MAX_PATH);
    if (len == 0) // memory not sufficient or general error occured
    {
        invalidateStringOutParameter(path, pathLength);
        return;
    }

    // Copy contents to caller, create caller ownership
    copyToStringOutParameter(exePath, len, path, pathLength);

#elif defined SYSTEM_SOLARIS

    // Preallocate PATH_MAX (e.g., 4096) characters and hope the executable path isn't longer (including null byte)
    char exePath[PATH_MAX];

    // Convert executable path to canonical path, return null pointer on error
    if (realpath(getexecname(), exePath) == 0x0)
    {
        invalidateStringOutParameter(path, pathLength);
        return;
    }

    // Copy contents to caller, create caller ownership
    unsigned int len = strlen(exePath);
    copyToStringOutParameter(exePath, len, path, pathLength);

#elif defined SYSTEM_DARWIN

    // Preallocate PATH_MAX (e.g., 4096) characters and hope the executable path isn't longer (including null byte)
    char exePath[PATH_MAX];

    unsigned int len = (unsigned int)PATH_MAX;

    // Obtain executable path to canonical path, return zero on success
    if (_NSGetExecutablePath(exePath, &len) == 0)
    {
        // Convert executable path to canonical path, return null pointer on error
        char * realPath = realpath(exePath, 0x0);

        if (realPath == 0x0)
        {
            invalidateStringOutParameter(path, pathLength);
            return;
        }

        // Copy contents to caller, create caller ownership
        unsigned int len = strlen(realPath);
        copyToStringOutParameter(realPath, len, path, pathLength);

        free(realPath);
    }
    else // len is initialized with the required number of bytes (including zero byte)
    {
        char * intermediatePath = (char *)malloc(sizeof(char) * len);

        // Convert executable path to canonical path, return null pointer on error
        if (_NSGetExecutablePath(intermediatePath, &len) != 0)
        {
            free(intermediatePath);
            invalidateStringOutParameter(path, pathLength);
            return;
        }

        char * realPath = realpath(intermediatePath, 0x0);

        free(intermediatePath);

        // Check if conversion to canonical path succeeded
        if (realPath == 0x0)
        {
            invalidateStringOutParameter(path, pathLength);
            return;
        }

        // Copy contents to caller, create caller ownership
        unsigned int len = strlen(realPath);
        copyToStringOutParameter(realPath, len, path, pathLength);

        free(realPath);
    }

#elif defined SYSTEM_FREEBSD

    // Preallocate characters and hope the executable path isn't longer (including null byte)
    char exePath[2048];

    unsigned int len = 2048;

    int mib[] = { CTL_KERN, KERN_PROC, KERN_PROC_PATHNAME, -1 };

    // Obtain executable path by syscall
    if (sysctl(mib, 4, exePath, &len, 0x0, 0) != 0)
    {
        invalidateStringOutParameter(path, pathLength);
        return;
    }

    // Copy contents to caller, create caller ownership
    copyToStringOutParameter(exePath, len, path, pathLength);

#else

    // If no OS could be detected ... degrade gracefully
    invalidateStringOutParameter(path, pathLength);

#endif
}

You can take a look in detail here.

Unsupportable answered 12/9, 2020 at 6:28 Comment(1)
As for the FREEBSD portion of your code: I think getprogname() is the more idiomatic way to do it. (#include <stdlib.h>, it being part of libc).Hattie
V
0

But I'd like to know if there is a convenient way to find the current application's directory in C/C++ with cross-platform interfaces.

You cannot do that (at least on Linux)

Since an executable could, during execution of a process running it, rename(2) its file path to a different directory (of the same file system). See also syscalls(2) and inode(7).

On Linux, an executable could even (in principle) remove(3) itself by calling unlink(2). The Linux kernel should then keep the file allocated till no process references it anymore. With proc(5) you could do weird things (e.g. rename(2) that /proc/self/exe file, etc...)

In other words, on Linux, the notion of a "current application's directory" does not make any sense.

Read also Advanced Linux Programming and Operating Systems: Three Easy Pieces for more.

Look also on OSDEV for several open source operating systems (including FreeBSD or GNU Hurd). Several of them provide an interface (API) close to POSIX ones.

Consider using (with permission) cross-platform C++ frameworks like Qt or POCO, perhaps contributing to them by porting them to your favorite OS.

Venge answered 12/9, 2020 at 6:36 Comment(1)
So if you have resource files relative to your executable, you hard code the paths to the resources in your code? :) Even if some (rather obscure) use cases exist, where a process runs something which did not come from any file on any path, the usual use case is that you want the absolute path to your executable in order to find other stuff, located relative to it.Hattie
E
-1

I have not found it in the standard where it is written to work, but in C++17 and onward, this works on all platforms I tested:

std::filesystem::canonical(argv[0])

Even though argv[0] contains different things depending on the platform, when canonicalized, it just works. It gives the absolute path to the executable, no matter the cwd.

Egocentric answered 20/4, 2023 at 10:49 Comment(1)
Only if the invocation isn't via the PATH (such as a binary in /usr/local/bin).Zicarelli

© 2022 - 2024 — McMap. All rights reserved.