File Descriptors
Every open file, socket, pipe, etc. is uniquely identified within your process by a number, called the file descriptor.
If you create a new file descriptor you will get the lowest unused file descriptor number in your process, starting at 0.
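The "lowest unused number" rule can be checked with a small sketch. The helper name `reuses_lowest_fd` is made up for illustration, and it assumes a single-threaded program so nothing else grabs the freed number in between:

```c
#include <fcntl.h>
#include <unistd.h>

/* Opens /dev/null, closes it, and opens it again.
 * Returns 1 if the second open() got the same number as the first,
 * 0 if not, -1 on error. */
int reuses_lowest_fd(void)
{
    int first = open("/dev/null", O_RDONLY);
    if (first < 0)
        return -1;
    close(first);                 /* 'first' is the lowest free number again */

    int second = open("/dev/null", O_RDONLY);
    if (second < 0)
        return -1;

    int same = (second == first);
    close(second);
    return same;
}
```

In a process that only has stdin, stdout and stderr open, both calls to `open()` would typically return 3.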
The first 3 file descriptors of each process have a special role:
FD | C Constant
0 | STDIN_FILENO
1 | STDOUT_FILENO
2 | STDERR_FILENO
If you want, you can always take a look at the file descriptors (and what they refer to) by querying /proc on Linux, e.g.:
ls -l /proc/<pid of your process>/fd
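The same information can be read from inside the program. This is a rough, Linux-only sketch that walks `/proc/self/fd`; the function name `print_open_fds` and the buffer sizes are my own choices, not anything standard:

```c
#include <dirent.h>
#include <stdio.h>
#include <unistd.h>

/* Prints "fd -> target" for every open descriptor of the calling process.
 * Linux-specific: relies on /proc/self/fd.
 * Returns the number of entries printed, or -1 if /proc is unavailable. */
int print_open_fds(void)
{
    DIR *dir = opendir("/proc/self/fd");
    if (dir == NULL)
        return -1;

    int count = 0;
    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL) {
        if (entry->d_name[0] == '.')
            continue;                       /* skip "." and ".." */

        char path[300];
        char target[4096];
        snprintf(path, sizeof path, "/proc/self/fd/%s", entry->d_name);

        /* Each entry is a symlink to whatever the descriptor refers to. */
        ssize_t len = readlink(path, target, sizeof target - 1);
        if (len < 0)
            continue;
        target[len] = '\0';

        printf("%s -> %s\n", entry->d_name, target);
        count++;
    }
    closedir(dir);
    return count;
}
```

Note that the listing also contains the descriptor that `opendir()` itself uses to read the directory.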
execve & its friends
execve replaces the program running in the current process with a new one, as specified by the arguments.
All file descriptors that your process had open will remain open¹ and the new program can use them.
¹ except those marked as close-on-exec
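For the footnote: a descriptor is marked close-on-exec by setting the `FD_CLOEXEC` flag with `fcntl()`. A minimal sketch (the helper names `set_cloexec` and `is_cloexec` are made up here):

```c
#include <fcntl.h>

/* Marks fd as close-on-exec, keeping its other descriptor flags.
 * Returns 0 on success, -1 on error. */
int set_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFD, flags | FD_CLOEXEC) < 0 ? -1 : 0;
}

/* Returns 1 if fd is marked close-on-exec, 0 if not, -1 on error. */
int is_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);
    if (flags < 0)
        return -1;
    return (flags & FD_CLOEXEC) != 0;
}
```

A descriptor marked this way is closed automatically during a successful exec, so the new program never sees it.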
What your program does
Directly after your program starts, your file descriptors could look like this:
0 -> /dev/pts/1
1 -> /dev/pts/1
2 -> /dev/pts/1
(just the normal stdin, stdout, stderr, connected to a normal terminal)
After that you allocate a socket:
int sock = socket(AF_INET, SOCK_STREAM, 0);
0 -> /dev/pts/1
1 -> /dev/pts/1
2 -> /dev/pts/1
3 -> [socket:12345]
Then you connect the socket and get to the dup2 calls.
dup2 clones a file descriptor and, unlike dup, assigns it a specific file descriptor number (if that fd is already in use, it will be closed first).
So after dup2(sock, STDIN_FILENO); your fd's would look like this:
0 -> [socket:12345]
1 -> /dev/pts/1
2 -> /dev/pts/1
3 -> [socket:12345]
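To see that both numbers really refer to the same underlying object after `dup2`, here is a small sketch that uses a pipe instead of a socket so it runs without a network peer. The helper name `dup2_onto` is hypothetical, and `target` is assumed to be a number that is not one of the pipe's own ends:

```c
#include <unistd.h>

/* Duplicates the write end of a fresh pipe onto 'target', writes one byte
 * through 'target' and checks that it arrives at the pipe's read end.
 * Returns 1 if it does, 0 if not, -1 on setup errors. */
int dup2_onto(int target)
{
    int p[2];
    if (pipe(p) < 0)
        return -1;

    /* If 'target' was already open, dup2() closes it first and then makes
     * it a second number for the pipe's write end. */
    if (dup2(p[1], target) != target) {
        close(p[0]);
        close(p[1]);
        return -1;
    }

    char out = 'x', in = 0;
    int ok = write(target, &out, 1) == 1    /* write through the copy...  */
          && read(p[0], &in, 1) == 1        /* ...read from the other end */
          && in == out;

    close(p[0]);
    if (target != p[1])
        close(target);
    close(p[1]);
    return ok;
}
```

For example, `dup2_onto(20)` would be expected to return 1, silently closing whatever was open as fd 20 before.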
So before the execl, once stdout and stderr have been redirected the same way, the fd's would be:
0 -> [socket:12345]
1 -> [socket:12345]
2 -> [socket:12345]
3 -> [socket:12345]
Then your process execs to /bin/sh, replacing the current program with a shell.
So now you have a shell with its input and output hooked up to the socket you created, effectively allowing the program on the other end of the socket to send arbitrary shell commands, which will be executed by /bin/sh, with the output returned via the socket.
As @JonathanLeffler has pointed out in the comments, the fd 3 could be closed before the exec, because it's not needed.
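The whole sequence can be tried locally with a sketch like the one below. It is not the code from the question: it swaps the connected AF_INET socket for a `socketpair()` and runs a single command via `sh -c`, so the parent can play the role of "the program on the other end". The function name `run_shell_over_socket` is made up for illustration.

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs 'command' in /bin/sh with fd 0, 1 and 2 hooked up to one end of a
 * socket pair and collects the shell's output from the other end in 'buf'
 * (NUL-terminated). Returns the number of bytes read, or -1 on error. */
ssize_t run_shell_over_socket(const char *command, char *buf, size_t size)
{
    int sv[2];
    if (size == 0 || socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    if (pid == 0) {                        /* child: becomes the shell */
        close(sv[0]);
        dup2(sv[1], STDIN_FILENO);
        dup2(sv[1], STDOUT_FILENO);
        dup2(sv[1], STDERR_FILENO);
        if (sv[1] > STDERR_FILENO)
            close(sv[1]);                  /* original number no longer needed */
        execl("/bin/sh", "sh", "-c", command, (char *)NULL);
        _exit(127);                        /* only reached if execl() failed */
    }

    close(sv[1]);                          /* parent keeps only its own end */
    size_t total = 0;
    ssize_t n;
    while (total < size - 1 &&
           (n = read(sv[0], buf + total, size - 1 - total)) > 0)
        total += (size_t)n;
    buf[total] = '\0';

    close(sv[0]);
    waitpid(pid, NULL, 0);
    return (ssize_t)total;
}
```

Calling it with `"echo hello"` should leave the shell's output in `buf`, because the shell's stdout is the socket. The `if (sv[1] > STDERR_FILENO) close(sv[1]);` line is the cleanup mentioned in the comment above.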
Why not use dup instead of dup2?
Using dup, like you quoted, will give you the lowest fd number that's available in your process.
So it would be possible to do the following:
close(STDIN_FILENO);
close(STDOUT_FILENO);
close(STDERR_FILENO);
dup(sock);
dup(sock);
dup(sock);
The closes would close fd 0-2:
3 -> [socket:12345]
and the three dup calls would duplicate fd 3 to 0-2 (you always get the lowest available number, even if those would be stdin, stdout or stderr):
0 -> [socket:12345]
1 -> [socket:12345]
2 -> [socket:12345]
3 -> [socket:12345]
However, this could potentially go wrong if you have other threads that are creating fd's (e.g. another thread might just be creating a new fd after you closed stdin, so it gets fd 0, and your last dup() would then get 4, leaving stdin pointing at something other than the socket).
So that's what dup2() is about: precisely assigning a specific fd (in this case stdin, stdout, stderr).
The dup2() system call performs the same task as dup(), but
instead of using the lowest-numbered unused file descriptor, it
uses the file descriptor number specified in newfd. In other
words, the file descriptor newfd is adjusted so that it now
refers to the same open file description as oldfd.
There's also dup3, which, in addition to what dup2 can do, allows you to specify flags, e.g. O_CLOEXEC, that would automatically close the fd when execing.
dup(), not dup2(). There's a bug in the code: it should have close(sock); or if (sock > STDERR_FILENO) close(sock); so that the socket file descriptor is closed, even though there are copies of the file descriptor on the standard I/O file descriptors. – Traumatism