Getting the highest allocated file descriptor

Is there a portable way (POSIX) to get the highest allocated file descriptor number for the current process?

I know that there's a nice way to get the number on AIX, for example, but I'm looking for a portable method.

The reason I'm asking is that I want to close all open file descriptors. My program is a server which runs as root and forks and execs child programs for non-root users. Leaving the privileged file descriptors open in the child process is a security problem. Some file descriptors may be opened by code I cannot control (the C library, third party libraries, etc.), so I cannot rely on FD_CLOEXEC either.

Bobsledding answered 22/5, 2009 at 17:35 Comment(3)
Note that it would be better to just open all your files with the close-on-exec flag set so that they're automatically closed by any of the exec-family functions. - Wolfhound
Modern glibc supports the "e" stdio.h FILE* open flag character to indicate FD_CLOEXEC treatment. - Jemmy
Also worth noting that close-on-exec is not close-on-fork. If you fork an unprivileged child process to run a subroutine from the same program, the child process will inherit the file descriptors of the privileged parent. It's not trivial to remember to close all the ones you don't need. Close-on-exec does not help in this situation. - Brachiate

While portable, closing all file descriptors up to sysconf(_SC_OPEN_MAX) is not reliable, because on most systems this call returns the current file descriptor soft limit, which could have been lowered below the highest used file descriptor. Another issue is that on many systems sysconf(_SC_OPEN_MAX) may return INT_MAX, which can cause this approach to be unacceptably slow. Unfortunately, there is no reliable, portable alternative that does not involve iterating over every possible non-negative int file descriptor.
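The soft-limit problem can be reproduced in a few lines. The following is a minimal sketch, assuming a system where sysconf(_SC_OPEN_MAX) tracks the RLIMIT_NOFILE soft limit (as glibc on Linux does). The function name and the numbers 40 and 20 are arbitrary choices for illustration, not part of any API.

```c
#include <fcntl.h>
#include <sys/resource.h>
#include <unistd.h>

/* Returns 1 if a descriptor stays open at or above the value reported by
 * sysconf(_SC_OPEN_MAX) once the soft limit has been lowered, 0 if it
 * does not, and -1 if the setup fails. */
int soft_limit_hides_open_fd(void) {
    struct rlimit saved, lowered;
    int p[2];

    if (getrlimit(RLIMIT_NOFILE, &saved) == -1 || pipe(p) == -1) {
        return -1;
    }
    /* Duplicate the read end onto the lowest free descriptor >= 40. */
    int high = fcntl(p[0], F_DUPFD, 40);
    close(p[0]);
    close(p[1]);
    if (high == -1) {
        return -1;
    }

    /* Lower the soft limit below the descriptor that is already open. */
    lowered = saved;
    lowered.rlim_cur = 20;
    if (setrlimit(RLIMIT_NOFILE, &lowered) == -1) {
        close(high);
        return -1;
    }

    long reported = sysconf(_SC_OPEN_MAX);
    int still_open = fcntl(high, F_GETFD) != -1;

    /* Restore the original limit and clean up. */
    setrlimit(RLIMIT_NOFILE, &saved);
    close(high);
    return still_open && reported >= 0 && high >= reported;
}
```

When the function returns 1, a loop that stops at sysconf(_SC_OPEN_MAX) would have missed the descriptor.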

Although not portable, most operating systems in common use today provide one or more of the following solutions to this problem:

  1. A library function to close all open file descriptors >= fd or within a range. This is the simplest solution for the common case of closing all file descriptors, although it cannot be used for much else. To close all file descriptors except for a certain set, dup2 can be used to move them to the low end beforehand, and to move them back afterward if necessary.

    • closefrom(fd) (Linux with glibc 2.34+, Solaris 9+, FreeBSD 7.3+, NetBSD 3.0+, OpenBSD 3.5+)

    • fcntl(fd, F_CLOSEM, 0) (AIX, IRIX, NetBSD)

    • close_range(lowfd, highfd, 0) (Linux kernel 5.9+ with glibc 2.34+, FreeBSD 12.2+)

  2. A library function to provide the maximum file descriptor currently in use by the process. To close all file descriptors above a certain number, either close all of them up to this maximum, or continually get and close the highest file descriptor in a loop until the low bound is reached. Which is more efficient depends on the file descriptor density.

    • fcntl(0, F_MAXFD) (NetBSD)

    • pstat_getproc(&ps, sizeof(struct pst_status), (size_t)0, (int)getpid())
      Returns information about the process, including the highest file descriptor currently open in ps.pst_highestfd. (HP-UX)

  3. A library function to list all file descriptors currently in use by the process. This is more flexible in that it allows for closing all file descriptors, finding the highest file descriptor, or doing just about anything else on every open file descriptor, possibly even those of another process. Example (OpenSSH)

    • proc_pidinfo(getpid(), PROC_PIDLISTFDS, 0, fdinfo_buf, sz) (macOS)

  4. The number of file descriptor slots currently allocated for a process provides an upper bound on the file descriptor numbers currently in use. Example (Ruby)

    • "FDSize:" line in /proc/pid/status or /proc/self/status (Linux)

  5. A directory containing an entry for each open file descriptor. This is similar to #3 except that it isn't a library function. This can be more complicated than the other approaches for the common uses, and can fail for a variety of reasons such as proc/fdescfs not mounted, a chroot environment, or no file descriptors available to open the directory (process or system limit). Therefore use of this approach is often combined with a fallback mechanism. Example (OpenSSH), another example (glib).

    • /proc/pid/fd/ or /proc/self/fd/ (Linux, Solaris, AIX, Cygwin, NetBSD)
      (AIX does not support "self")

    • /dev/fd/ (FreeBSD, macOS)

It can be difficult to handle all corner cases reliably with this approach. For example consider the situation where all file descriptors >= fd are to be closed, but all file descriptors < fd are used, the current process resource limit is fd, and there are file descriptors >= fd in use. Because the process resource limit has been reached the directory cannot be opened. If closing every file descriptor from fd through the resource limit or sysconf(_SC_OPEN_MAX) is used as a fallback, nothing will be closed.
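The directory-based approach with a fallback might look roughly like the sketch below. It assumes a Linux-style /proc/self/fd; close_fds_from is a hypothetical helper name (not the libc closefrom), and the fallback is exactly the brute-force loop whose limitations are described above. Note that opendir() is not async-signal-safe, so this is not a drop-in for use between fork and exec in a multithreaded program.

```c
#include <dirent.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: closes every open descriptor >= lowfd.
 * Returns 1 if the directory listing was used, 0 if it fell back to
 * the brute-force loop. */
int close_fds_from(int lowfd) {
    DIR *dir = opendir("/proc/self/fd");   /* assumption: procfs is mounted */
    if (dir == NULL) {
        /* Fallback: may be slow if the limit is huge, and misses
         * descriptors above a lowered soft limit. */
        long maxfd = sysconf(_SC_OPEN_MAX);
        if (maxfd < 0) {
            maxfd = 1024;                  /* arbitrary guess */
        }
        for (long fd = lowfd; fd < maxfd; fd++) {
            close((int) fd);
        }
        return 0;
    }

    int self = dirfd(dir);                 /* do not close the directory itself */
    struct dirent *ent;
    while ((ent = readdir(dir)) != NULL) {
        char *end;
        long fd = strtol(ent->d_name, &end, 10);
        if (end == ent->d_name || *end != '\0') {
            continue;                      /* skips "." and ".." */
        }
        if (fd >= lowfd && fd != self) {
            close((int) fd);
        }
    }
    closedir(dir);
    return 1;
}
```

A typical call would be close_fds_from(3) in the child, right before exec.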

Cromwell answered 27/5, 2009 at 23:25 Comment(4)
About approach 3: there are serious issues in using this between fork/exec in a multithreaded program because opendir() may call malloc() which may deadlock in this situation. I'm afraid that there is just no way to do what the question asked under Linux, and the devs won't do a thing about it: sourceware.org/bugzilla/show_bug.cgi?id=10353 - Upchurch
@medoc: glibc development underwent a major reorganization in 2012, and several previously-rejected things have now made it in under the new development model. It may be worthwhile to start a new discussion on the issue. - Cromwell
glibc 2.34 (August 2021) has now added closefrom(), and close_range() as well. I've added those to the answer. - Cromwell
On Mac OSX there is a posix_spawn attribute extension that can do this: POSIX_SPAWN_CLOEXEC_DEFAULT - Scrawny

The POSIX way is:

int maxfd=sysconf(_SC_OPEN_MAX);
for(int fd=3; fd<maxfd; fd++)
    close(fd);

(note that's closing from 3 up, to keep stdin/stdout/stderr open)

close() harmlessly fails with EBADF (it returns -1 and sets errno) if the file descriptor is not open. There's no need to waste another system call checking.

Some Unixes support a closefrom(). This avoids the excessive number of calls to close() depending on the maximum possible file descriptor number. While the best solution I'm aware of, it's completely nonportable.
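To make the EBADF behavior concrete, here is a small sketch of the same loop with explicit bounds that also counts how many descriptors were really open. The function name close_fd_range is made up for this illustration and is unrelated to the Linux close_range() call.

```c
#include <unistd.h>

/* Closes every descriptor in [lowfd, highfd) and returns how many of
 * them were actually open. */
int close_fd_range(int lowfd, int highfd) {
    int closed = 0;
    for (int fd = lowfd; fd < highfd; fd++) {
        /* close() returns 0 on success; on an unused descriptor it
         * returns -1 with errno set to EBADF, which is simply ignored. */
        if (close(fd) == 0) {
            closed++;
        }
    }
    return closed;
}
```

Calling close_fd_range(3, (int) sysconf(_SC_OPEN_MAX)) behaves like the loop above, so it inherits the same cost when the limit is very large.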

Negress answered 22/5, 2009 at 19:15 Comment(0)

I've written code to deal with all platform-specific features. All functions are async-signal safe. Thought people might find this useful. Only tested on OS X right now, feel free to improve/fix.

// Async-signal safe way to get the current process's hard file descriptor limit.
static int
getFileDescriptorLimit() {
    long long sysconfResult = sysconf(_SC_OPEN_MAX);

    struct rlimit rl;
    long long rlimitResult;
    if (getrlimit(RLIMIT_NOFILE, &rl) == -1) {
        rlimitResult = -1;
    } else {
        rlimitResult = (long long) rl.rlim_max;
    }

    long result;
    if (sysconfResult > rlimitResult) {
        result = sysconfResult;
    } else {
        result = rlimitResult;
    }
    if (result < 0) {
        // Both calls returned errors.
        result = 9999;
    } else if (result < 2) {
        // The calls reported broken values.
        result = 2;
    }
    return result;
}

// Async-signal safe function to get the highest file
// descriptor that the process is currently using.
// See also https://mcmap.net/q/345943/-getting-the-highest-allocated-file-descriptor
static int
getHighestFileDescriptor() {
#if defined(F_MAXFD)
    int ret;

    do {
        ret = fcntl(0, F_MAXFD);
    } while (ret == -1 && errno == EINTR);
    if (ret == -1) {
        ret = getFileDescriptorLimit();
    }
    return ret;

#else
    int p[2], ret, flags;
    pid_t pid = -1;
    int result = -1;

    /* Since opendir() may not be async signal safe and thus may lock up
     * or crash, we use it in a child process which we kill if we notice
     * that things are going wrong.
     */

    // Make a pipe.
    p[0] = p[1] = -1;
    do {
        ret = pipe(p);
    } while (ret == -1 && errno == EINTR);
    if (ret == -1) {
        goto done;
    }

    // Make the read side non-blocking.
    do {
        flags = fcntl(p[0], F_GETFL);
    } while (flags == -1 && errno == EINTR);
    if (flags == -1) {
        goto done;
    }
    do {
        ret = fcntl(p[0], F_SETFL, flags | O_NONBLOCK);
    } while (ret == -1 && errno == EINTR);
    if (ret == -1) {
        goto done;
    }

    do {
        pid = fork();
    } while (pid == -1 && errno == EINTR);

    if (pid == 0) {
        // Don't close p[0] here or it might affect the result.

        resetSignalHandlersAndMask();

        struct sigaction action;
        action.sa_handler = _exit;
        action.sa_flags   = SA_RESTART;
        sigemptyset(&action.sa_mask);
        sigaction(SIGSEGV, &action, NULL);
        sigaction(SIGPIPE, &action, NULL);
        sigaction(SIGBUS, &action, NULL);
        sigaction(SIGILL, &action, NULL);
        sigaction(SIGFPE, &action, NULL);
        sigaction(SIGABRT, &action, NULL);

        DIR *dir = NULL;
        #ifdef __APPLE__
            /* /dev/fd can always be trusted on OS X. */
            dir = opendir("/dev/fd");
        #else
            /* On FreeBSD and possibly other operating systems, /dev/fd only
             * works if fdescfs is mounted. If it isn't mounted then /dev/fd
             * still exists but always returns [0, 1, 2] and thus can't be
             * trusted. If /dev and /dev/fd are on different filesystems
             * then that probably means fdescfs is mounted.
             */
            struct stat dirbuf1, dirbuf2;
            if (stat("/dev", &dirbuf1) == -1
             || stat("/dev/fd", &dirbuf2) == -1) {
                _exit(1);
            }
            if (dirbuf1.st_dev != dirbuf2.st_dev) {
                dir = opendir("/dev/fd");
            }
        #endif
        if (dir == NULL) {
            dir = opendir("/proc/self/fd");
            if (dir == NULL) {
                _exit(1);
            }
        }

        struct dirent *ent;
        union {
            int highest;
            char data[sizeof(int)];
        } u;
        u.highest = -1;

        while ((ent = readdir(dir)) != NULL) {
            if (ent->d_name[0] != '.') {
                int number = atoi(ent->d_name);
                if (number > u.highest) {
                    u.highest = number;
                }
            }
        }
        if (u.highest != -1) {
            ssize_t ret, written = 0;
            do {
                ret = write(p[1], u.data + written, sizeof(int) - written);
                if (ret == -1) {
                    _exit(1);
                }
                written += ret;
            } while (written < (ssize_t) sizeof(int));
        }
        closedir(dir);
        _exit(0);

    } else if (pid == -1) {
        goto done;

    } else {
        do {
            ret = close(p[1]);
        } while (ret == -1 && errno == EINTR);
        p[1] = -1;

        union {
            int highest;
            char data[sizeof(int)];
        } u;
        ssize_t ret, bytesRead = 0;
        struct pollfd pfd;
        pfd.fd = p[0];
        pfd.events = POLLIN;

        do {
            do {
                // The child process must finish within 30 ms, otherwise
                // we might as well query sysconf.
                ret = poll(&pfd, 1, 30);
            } while (ret == -1 && errno == EINTR);
            if (ret <= 0) {
                goto done;
            }

            do {
                ret = read(p[0], u.data + bytesRead, sizeof(int) - bytesRead);
            } while (ret == -1 && errno == EINTR);
            if (ret == -1) {
                if (errno != EAGAIN) {
                    goto done;
                }
            } else if (ret == 0) {
                goto done;
            } else {
                bytesRead += ret;
            }
        } while (bytesRead < (ssize_t) sizeof(int));

        result = u.highest;
        goto done;
    }

done:
    if (p[0] != -1) {
        do {
            ret = close(p[0]);
        } while (ret == -1 && errno == EINTR);
    }
    if (p[1] != -1) {
        do {
            ret = close(p[1]);
        } while (ret == -1 && errno == EINTR);
    }
    if (pid != -1) {
        do {
            ret = kill(pid, SIGKILL);
        } while (ret == -1 && errno == EINTR);
        do {
            ret = waitpid(pid, NULL, 0);
        } while (ret == -1 && errno == EINTR);
    }

    if (result == -1) {
        result = getFileDescriptorLimit();
    }
    return result;
#endif
}

void
closeAllFileDescriptors(int lastToKeepOpen) {
    #if defined(F_CLOSEM)
        int ret;
        do {
            ret = fcntl(lastToKeepOpen + 1, F_CLOSEM);
        } while (ret == -1 && errno == EINTR);
        if (ret != -1) {
            return;
        }
    #elif defined(HAS_CLOSEFROM)
        closefrom(lastToKeepOpen + 1);
        return;
    #endif

    for (int i = getHighestFileDescriptor(); i > lastToKeepOpen; i--) {
        int ret;
        do {
            ret = close(i);
        } while (ret == -1 && errno == EINTR);
    }
}
Muscarine answered 22/5, 2009 at 17:36 Comment(0)

On MacOS, you can use posix_spawn with the Apple extension POSIX_SPAWN_CLOEXEC_DEFAULT set with posix_spawnattr_setflags.

This will leave open only the file descriptors set up explicitly in the posix_spawn call, closing all the others.
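A rough sketch of how that could be wired up follows. The function name is hypothetical, and the flag and posix_spawn_file_actions_addinherit_np are Apple extensions, so they are guarded by a preprocessor check; on other platforms this sketch compiles but spawns the child with ordinary descriptor inheritance.

```c
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Hypothetical helper: spawns argv[0] and returns its exit status,
 * or -1 on failure. */
int spawn_without_inherited_fds(char *const argv[]) {
    posix_spawnattr_t attr;
    posix_spawn_file_actions_t actions;
    pid_t pid;
    int status = -1;

    if (posix_spawnattr_init(&attr) != 0) {
        return -1;
    }
    if (posix_spawn_file_actions_init(&actions) != 0) {
        posix_spawnattr_destroy(&attr);
        return -1;
    }
#ifdef POSIX_SPAWN_CLOEXEC_DEFAULT
    /* Apple extension: treat every descriptor as close-on-exec ... */
    posix_spawnattr_setflags(&attr, POSIX_SPAWN_CLOEXEC_DEFAULT);
    /* ... except the ones explicitly marked as inherited. */
    for (int fd = 0; fd <= 2; fd++) {
        posix_spawn_file_actions_addinherit_np(&actions, fd);
    }
#endif
    int rc = posix_spawn(&pid, argv[0], &actions, &attr, argv, environ);
    posix_spawn_file_actions_destroy(&actions);
    posix_spawnattr_destroy(&attr);

    if (rc == 0 && waitpid(pid, &status, 0) == pid && WIFEXITED(status)) {
        return WEXITSTATUS(status);
    }
    return -1;
}
```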

Scrawny answered 3/8, 2021 at 12:55 Comment(0)

Do it right when your program starts, before it has opened anything, e.g. at the start of main(): create a pipe and fork immediately to start an executor server. That way its memory and other state are clean, and you can just hand it things to fork and exec.

#include <unistd.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

struct PipeStreamHandles {
    /** Write to this */
    int output;
    /** Read from this */
    int input;

    /** true if this process is the child after a fork */
    bool isChild;
    pid_t childProcessId;
};

PipeStreamHandles forkFullDuplex(){
    int childInput[2];
    int childOutput[2];

    pipe(childInput);
    pipe(childOutput);

    pid_t pid = fork();
    PipeStreamHandles streams;
    if(pid == 0){
        // child
        close(childInput[1]);
        close(childOutput[0]);

        streams.output = childOutput[1];
        streams.input = childInput[0];
        streams.isChild = true;
        streams.childProcessId = getpid();
    } else {
        close(childInput[0]);
        close(childOutput[1]);

        streams.output = childInput[1];
        streams.input = childOutput[0];
        streams.isChild = false;
        streams.childProcessId = pid;
    }

    return streams;
}


struct ExecuteData {
    char command[2048];
    bool shouldExit;
};

ExecuteData getCommand() {
    // Maybe use JSON or something similar to describe what to execute,
    // the environment (if any), etc.
    // You can read via stdin because of the dup2 setup done
    // in setupExecutor.
    ExecuteData data;
    memset(&data, 0, sizeof(data));
    data.shouldExit = fgets(data.command, 2047, stdin) == NULL;
    return data;
}

void executorServer(){

    while(true){
        printf("executor server waiting for command\n");
        // Maybe use JSON or something similar to describe what to execute,
        // the environment (if any), etc.
        ExecuteData command = getCommand();
        // one way is for getCommand() to check if stdin is gone
        // that way you can set shouldExit to true
        if(command.shouldExit){
            break;
        }
        printf("executor server doing command %s", command.command);
        system(command.command);
        // free command resources.
    }
}

static PipeStreamHandles executorStreams;
void setupExecutor(){
    PipeStreamHandles handles = forkFullDuplex();

    if(handles.isChild){
        // This simplifies things so we can just use standard IO
        dup2(handles.input, 0);
        // This is commented out so that we still see the output.
        // dup2(handles.output, 1);
        close(handles.input);
        // Left commented out as well so we can see hello world.
        // Enable it together with the dup2 above if you want to capture the output.
        //close(handles.output);
        handles.input = 0;
        handles.output = 1;
        printf("started child\n");
        executorServer();
        printf("exiting executor\n");
        exit(0);
    }

    executorStreams = handles;
}

/** Only has 0, 1, 2 file descriptors open */
void cleanForkAndExecute(const char *command) {
    // Using JSON and a JSON parser might be better,
    // so you can pass other data such as the environment,
    // and also get back details like the new process id so you can
    // wait for it and ask other relevant questions.
    write(executorStreams.output, command, strlen(command));
    write(executorStreams.output, "\n", 1);
}

int main () {
    // needs to be done early so future fds do not get open
    setupExecutor();

    // run your program as usual.
    cleanForkAndExecute("echo hello world");
    sleep(3);
}

If you want to do IO on the executed program, the executor server will have to redirect its streams over sockets; Unix sockets can be used for this.

Bodi answered 11/6, 2016 at 22:2 Comment(0)

Why don't you close all descriptors from 0 to, say, 10000?

It would be pretty fast, and the worst thing that would happen is EBADF.

Wunderlich answered 22/5, 2009 at 17:54 Comment(1)
Will work, but you will have to make that configurable, as you simply don't know how many need to be closed (depends on the load). - Sambar
