Having trouble with fork(), pipe(), dup2() and exec() in C
Here's my code:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/wait.h>
#include <readline/readline.h>

#define NUMPIPES 2

int main(int argc, char *argv[]) {
    char *bBuffer, *sPtr, *aPtr = NULL, *pipeComms[NUMPIPES], *cmdArgs[10];
    int fdPipe[2], pCount, aCount, i, status, lPids[NUMPIPES];
    pid_t pid;

    pipe(fdPipe);

    while(1) {
        bBuffer = readline("Shell> ");

        if(!strcasecmp(bBuffer, "exit")) {
            return 0;
        }

        sPtr = bBuffer;
        pCount = -1;

        do {
            aPtr = strsep(&sPtr, "|");
            pipeComms[++pCount] = aPtr;
        } while(aPtr);

        for(i = 0; i < pCount; i++) {
            aCount = -1;

            do {
                aPtr = strsep(&pipeComms[i], " ");
                cmdArgs[++aCount] = aPtr;
            } while(aPtr);

            cmdArgs[aCount] = 0;

            if(strlen(cmdArgs[0]) > 0) {
                pid = fork();

                if(pid == 0) {
                    if(i == 0) {
                        close(fdPipe[0]);

                        dup2(fdPipe[1], STDOUT_FILENO);

                        close(fdPipe[1]);
                    } else if(i == 1) {
                        close(fdPipe[1]);

                        dup2(fdPipe[0], STDIN_FILENO);

                        close(fdPipe[0]);
                    }

                    execvp(cmdArgs[0], cmdArgs);
                    exit(1);
                } else {
                    lPids[i] = pid;

                    /*waitpid(pid, &status, 0);

                    if(WIFEXITED(status)) {
                        printf("[%d] TERMINATED (Status: %d)\n",
                            pid, WEXITSTATUS(status));
                    }*/
                }
            }
        }

        for(i = 0; i < pCount; i++) {
            waitpid(lPids[i], &status, 0);

            if(WIFEXITED(status)) {
                printf("[%d] TERMINATED (Status: %d)\n",
                    lPids[i], WEXITSTATUS(status));
            }
        }
    }

    return 0;
}

(The code was updated to reflect the changes proposed by two of the answers below; it still doesn't work as it should...)

Here's the test case where this fails:

nazgulled ~/Projects/SO/G08 $ ls -l
total 8
-rwxr-xr-x 1 nazgulled nazgulled  7181 2009-05-27 17:44 a.out
-rwxr-xr-x 1 nazgulled nazgulled   754 2009-05-27 01:42 data.h
-rwxr-xr-x 1 nazgulled nazgulled  1305 2009-05-27 17:50 main.c
-rwxr-xr-x 1 nazgulled nazgulled   320 2009-05-27 01:42 makefile
-rwxr-xr-x 1 nazgulled nazgulled 14408 2009-05-27 17:21 prog
-rwxr-xr-x 1 nazgulled nazgulled  9276 2009-05-27 17:21 prog.c
-rwxr-xr-x 1 nazgulled nazgulled 10496 2009-05-27 17:21 prog.o
-rwxr-xr-x 1 nazgulled nazgulled    16 2009-05-27 17:19 test
nazgulled ~/Projects/SO/G08 $ ./a.out 
Shell> ls -l|grep prog
[4804] TERMINATED (Status: 0)
-rwxr-xr-x 1 nazgulled nazgulled 14408 2009-05-27 17:21 prog
-rwxr-xr-x 1 nazgulled nazgulled  9276 2009-05-27 17:21 prog.c
-rwxr-xr-x 1 nazgulled nazgulled 10496 2009-05-27 17:21 prog.o

The problem is that I should return to my shell after that: I should see "Shell> " waiting for more input. You can also notice that there is no second message similar to "[4804] TERMINATED (Status: 0)" (with a different pid), which means the second process didn't terminate.

I think it has something to do with grep, because this works:

nazgulled ~/Projects/SO/G08 $ ./a.out 
Shell> echo q|sudo fdisk /dev/sda
[4838] TERMINATED (Status: 0)

The number of cylinders for this disk is set to 1305.
There is nothing wrong with that, but this is larger than 1024,
and could in certain setups cause problems with:
1) software that runs at boot time (e.g., old versions of LILO)
2) booting and partitioning software from other OSs
   (e.g., DOS FDISK, OS/2 FDISK)

Command (m for help): 
[4839] TERMINATED (Status: 0)

You can easily see two "terminate" messages...

So, what's wrong with my code?

Tepid answered 27/5, 2009 at 17:0 Comment(5)
It looks like grep didn't reach the end. It failed to print out the third matching line ("prog.o").Adham
It did print the third matching line, I just didn't copy the whole thing correctly. I've already edited the post to fix that.Tepid
I have a tiny (999LOC) shell example written for a university assignment years ago. patch-tag.com/r/xsh In particular, job.c#job_run and process.c#process_run contain the pipeline setup, and main.c#waitpid_wrapper contains the wait-handling.Recrudescence
Thanks, but I rather keep trying to do it myself and post my issues when I have them, I learn more and better that way. Also, your code is way more than I need to know for the time being.Tepid
My code is a lot less to read than Bash :) You'll probably have to figure out how to deal with setpgid and tcsetpgrp eventually, and all the other fun that comes with job control and the historical design of UNIX sessions and terminal process groups...Recrudescence
R
33

Even after the first command of your pipeline exits (and thus closes its stdout, ~fdPipe[1]), the parent still has fdPipe[1] open.

Thus, the second command of the pipeline has a stdin=~fdPipe[0] that never gets an EOF, because the other endpoint of the pipe is still open.

You need to create a new pipe(fdPipe) for each |, and make sure to close both endpoints in the parent; i.e.

for cmd in cmds
    if there is a next cmd
        pipe(new_fds)
    fork
    if child
        if there is a previous cmd
            dup2(old_fds[0], 0)
            close(old_fds[0])
            close(old_fds[1])
        if there is a next cmd
            close(new_fds[0])
            dup2(new_fds[1], 1)
            close(new_fds[1])
        exec cmd || die
    else
        if there is a previous cmd
            close(old_fds[0])
            close(old_fds[1])
        if there is a next cmd
            old_fds = new_fds
if there are multiple cmds
    close(old_fds[0])
    close(old_fds[1])
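
As a rough illustration, the pseudocode above might look like this in C (a sketch only: `run_pipeline` is a hypothetical helper name, error checking on pipe()/fork() is omitted, and each stage's argv is assumed to be already parsed):

```c
#include <assert.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run ncmds commands as a pipeline; cmds[i] is a NULL-terminated argv.
 * Returns the exit status of the last command.  A sketch, not
 * production code: no error handling on pipe()/fork(). */
static int run_pipeline(char **cmds[], int ncmds) {
    int old_fds[2] = {-1, -1}, new_fds[2];
    pid_t last = -1, pid;

    for (int i = 0; i < ncmds; i++) {
        if (i + 1 < ncmds)              /* if there is a next cmd */
            pipe(new_fds);

        pid = fork();
        if (pid == 0) {
            if (i > 0) {                /* if there is a previous cmd */
                dup2(old_fds[0], 0);
                close(old_fds[0]);
                close(old_fds[1]);
            }
            if (i + 1 < ncmds) {        /* if there is a next cmd */
                close(new_fds[0]);
                dup2(new_fds[1], 1);
                close(new_fds[1]);
            }
            execvp(cmds[i][0], cmds[i]);
            _exit(127);                 /* exec failed */
        }
        last = pid;
        if (i > 0) {                    /* parent drops the previous pipe */
            close(old_fds[0]);
            close(old_fds[1]);
        }
        if (i + 1 < ncmds) {
            old_fds[0] = new_fds[0];
            old_fds[1] = new_fds[1];
        }
    }

    /* reap every stage; report the last command's status */
    int status = 0, wstatus;
    while ((pid = wait(&wstatus)) > 0)
        if (pid == last && WIFEXITED(wstatus))
            status = WEXITSTATUS(wstatus);
    return status;
}
```

Note that each pipe created in iteration i is closed by the parent in iteration i+1, so by the time the wait loop runs, the parent holds no pipe ends and every reader can see EOF.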

Also, to be safer, you should handle the case where fdPipe and {STDIN_FILENO,STDOUT_FILENO} overlap before performing any of the close and dup2 operations. This can happen if somebody manages to start your shell with stdin or stdout closed, and it will result in great confusion with the code here.

Edit

   fdPipe1           fdPipe3
      v                 v
cmd1  |  cmd2  |  cmd3  |  cmd4  |  cmd5
               ^                 ^
            fdPipe2           fdPipe4

In addition to making sure you close the pipe's endpoints in the parent, I was trying to make the point that fdPipe1, fdPipe2, etc. cannot be the same pipe().

/* suppose stdin and stdout have been closed...
 * for example, if your program was started with "./a.out <&- >&-" */
close(0), close(1);

/* then the result you get back from pipe() is {0, 1} or {1, 0}, since
 * fd numbers are always allocated from the lowest available */
pipe(fdPipe);

close(0);
dup2(fdPipe[0], 0);

I know you don't use close(0) in your present code, but the last paragraph is warning you to watch out for this case.
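
One way to guard against that case (an assumption on my part, not something your assignment requires) is to move any pipe fd that landed in the 0-2 range out of the way with fcntl(F_DUPFD) before touching the standard descriptors:

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical guard: ensure both pipe fds are above the standard
 * descriptors before any dup2/close dance.  fcntl(fd, F_DUPFD, 3)
 * duplicates fd onto the lowest free descriptor >= 3. */
static void pipe_above_stdio(int fds[2]) {
    for (int i = 0; i < 2; i++) {
        if (fds[i] < 3) {
            int moved = fcntl(fds[i], F_DUPFD, 3);
            close(fds[i]);
            fds[i] = moved;
        }
    }
}
```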

Edit

The following minimal change to your code makes it work in the specific failing case you mentioned:

@@ -12,6 +12,4 @@
     pid_t pid;

-    pipe(fdPipe);
-
     while(1) {
         bBuffer = readline("Shell> ");
@@ -29,4 +27,6 @@
         } while(aPtr);

+        pipe(fdPipe);
+
         for(i = 0; i < pCount; i++) {
                 aCount = -1;
@@ -72,4 +72,7 @@
         }

+        close(fdPipe[0]);
+        close(fdPipe[1]);
+
         for(i = 0; i < pCount; i++) {
                 waitpid(lPids[i], &status, 0);

This won't work for pipelines with more than two commands; for that, you'd need something like this (untested, as you have to fix other things as well):

@@ -9,9 +9,7 @@
 int main(int argc, char *argv[]) {
     char *bBuffer, *sPtr, *aPtr = NULL, *pipeComms[NUMPIPES], *cmdArgs[10];
-    int fdPipe[2], pCount, aCount, i, status, lPids[NUMPIPES];
+    int fdPipe[2], fdPipe2[2], pCount, aCount, i, status, lPids[NUMPIPES];
     pid_t pid;

-    pipe(fdPipe);
-
     while(1) {
         bBuffer = readline("Shell> ");
@@ -32,4 +30,7 @@
                 aCount = -1;

+                if (i + 1 < pCount)
+                    pipe(fdPipe2);
+
                 do {
                         aPtr = strsep(&pipeComms[i], " ");
@@ -43,11 +44,12 @@

                         if(pid == 0) {
-                                if(i == 0) {
-                                        close(fdPipe[0]);
+                                if(i + 1 < pCount) {
+                                        close(fdPipe2[0]);

-                                        dup2(fdPipe[1], STDOUT_FILENO);
+                                        dup2(fdPipe2[1], STDOUT_FILENO);

-                                        close(fdPipe[1]);
-                                } else if(i == 1) {
+                                        close(fdPipe2[1]);
+                                }
+                                if(i != 0) {
                                         close(fdPipe[1]);

@@ -70,4 +72,17 @@
                         }
                 }
+
+                if (i != 0) {
+                    close(fdPipe[0]);
+                    close(fdPipe[1]);
+                }
+
+                fdPipe[0] = fdPipe2[0];
+                fdPipe[1] = fdPipe2[1];
+        }
+
+        if (pCount) {
+            close(fdPipe[0]);
+            close(fdPipe[1]);
         }
Recrudescence answered 27/5, 2009 at 19:53 Comment(18)
I see what you mean, but I'm finding it rather difficult to understand that pseudo code and everything it's doing. That old and new fds pipe thing is confusing me. Also, not sure what you mean by the last paragraph.Tepid
Ignoring the old_fds/new_fds thing, if I change your code to run pipe() within the readline loop and close(fdPipe[]) before waiting, it works in the immediate sense that the failing pipeline in the question no longer fails. The rest is cautionary, as *other things that might be wrong as you go further.Recrudescence
I see that you probably didn't notice in the code, but I was only considering the case with 2 commands, where one pipe would suffice. Still, I didn't know I needed more than one pipe for more than 2 commands, so that will come in handy. For now, I'll concentrate on just two, try to understand that, and move to the next step. I still don't quite get everything you said besides that and that I need to close them in the parent. I think it's better if I fix this for two commands and then post back my code; if I'm doing something wrong or missing something, hopefully you'll point it out.Tepid
One last thing: I'm a little ahead of what we're supposed to do; this exercise isn't due until next week's Friday, and because I have other classes to study for, I might not post here for a few days. If you don't mind keeping an eye on this (maybe through RSS), it will be much appreciated. (P.S.: I'll probably mark this answer as the accepted one; I just want to make sure that I get it and can fix my code before I do.)Tepid
This is the best answer to accept - I agree. And the generalization to multi-stage pipelines will be important eventually. I believe that some shells - probably bash amongst them - fork off a single process to run the entire pipeline, and that first child eventually execs the last process in the pipeline, after fixing up the plumbing between the various other stages. The child also becomes a process group leader and other things so that job control works sanely.Tova
I think I'll be asking the teacher about that, if I should spawn a new process for each command in the pipeline or just one...Tepid
By the way, how does bash fork only one single process to run the entire pipeline? I can't get my head around how is that possible...Tepid
So the suggestion is to close the ends of the pipe in the first process after the loop that does the forking, and before the loop that waits. These open ends are preventing the ctrl-D (EOF) that grep needs to finish. The danger ephemient mentions would occur if you were to have more than two commands, so more than one pipe. If you closed one pipe before creating the next one, the same fds would be re-used, causing problems.Adham
Jonathan thinks that Bash perhaps first forks, then from within that child, sets up all the pipes and forks/execs all the commands in the pipeline, exec'ing the last command and setting it as the group leader. While this seems reasonable, I can confirm that Bash 3.2 and 4.0 do not do this -- they pipe and fork from the parent, and uses the first command of the pipeline as the group leader. Incidentally, the simple xsh I left in a comment to the question does this as well.Recrudescence
@uncleo, it's okay for the pipe FDs to become re-used after you've closed them, as they'll refer to different pipes. The warning is to check before mangling the pipe FDs in case they overlap with the "standard range" of FDs.Recrudescence
@ephemient: you looked at the code - I bow to your superior knowledge. I'll have to see how the children are associated with the group leader, sometime, when I have time. My main points - that your answer was more on target than mine (even though I made one valid point), and that job control requires care - remain valid.Tova
After a while, I'm still finding the algorithm rather confusing to apply to my code... I honestly look at the algorithm and can't understand how it is going to work the way I want it to.Tepid
@Recrudescence I tried your pseudocode and it works for size == 2, but for size > 2 it doesn't work. any idea why?Billfish
@Mariska, I can only guess without more info. One possible failure is if you're writing in C, literally writing old_fds = new_fds won't work.Recrudescence
@Recrudescence Yes, I handled the old_fds[0] = new_fds[0],etc. I have my sample short code in C, but I'm not sure how I can share with you to take a look?Billfish
@Mariska, you can ask a question.Recrudescence
I don't get why we need the last if in the pseudocode. Say we have multiple cmds: in the last iteration of the for loop we skip the new_fds part, but closing old_fds won't be skipped. So in what situation would we have old_fds open after the for block?Citystate
And any reason for doing dup2, close, close vs close, dup2, close in the child process? Thx.Citystate
T
3

You should have an error exit after execvp() - it will fail sometime.

exit(EXIT_FAILURE);

As @uncleo points out, the argument list must have a null pointer to indicate the end:

cmdArgs[aCount] = 0;

It is not clear to me that you let both programs run free - it appears that you require the first program in the pipeline to finish before starting the second, which is not a recipe for success if the first program blocks because the pipe is full.
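
For instance, a minimal fork/exec wrapper with that error exit might look like this (a sketch; `spawn_and_wait` is a made-up name, and 127 is just the conventional shell status for a command that couldn't be exec'd):

```c
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork, exec argv, and wait; report exec failure via status 127. */
static int spawn_and_wait(char *argv[]) {
    pid_t pid = fork();
    if (pid == 0) {
        execvp(argv[0], argv);
        perror(argv[0]);        /* only reached if exec failed */
        _exit(127);
    }
    int status;
    waitpid(pid, &status, 0);
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```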

Tova answered 27/5, 2009 at 17:48 Comment(7)
I didn't add any type of error handling on purpose; no need to make the code even bigger for testing... Like I said above, doesn't strsep() handle that? The last argument will always be NULL because strsep() does that itself.Tepid
He's right about the running free. You fork off a process and wait for it to finish before forking off the next one. Perhaps grep is stalling with a filled pipe.Adham
None of those suggestions solved the problem... I just tested everything and the same thing happens.Tepid
@Nazgulled: I understand no error handling - but I'd still put an exit() after execvp(). Maybe you're safe enough - that makes sure. Also, your 'ls' listing is short enough that it shouldn't fill any pipe buffer - fortunately, you're testing in a small environment. I see that strsep() will give you a null - you're using do { } while ().Tova
Be aware that strsep() is not available everywhere - specifically, not on Solaris 10.Tova
I just updated the code in the question to reflect the changes suggested. I understand what you're saying, Jonathan; the full code for my bash handles all errors, including the ones from execvp(). This is just a test example so I can understand pipes and make them work, so that I can then adapt the code to my bash. Also, strsep() is available for me, which is enough. This is just a university exercise, not a real, live program.Tepid
@Nazgulled: Don't write "cheap code" just because it's an "exercise". You would much rather build a habit of writing good code than try to break a habit of writing bad code.Recrudescence
A
1

Jonathan has the right idea. You rely on the first process to fork all the others. Each one has to run to completion before the next one is forked.

Instead, fork the processes in a loop like you are doing, but wait for them outside the inner loop, (at the bottom of the big loop for the shell prompt).

loop //for prompt
    next prompt
    loop //to fork tasks, store the pids
        if pid == 0 run command
        else store the pid
    end loop
    loop // on pids
        wait
    end loop
end loop
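
In C, that fork-everything-then-wait shape could be sketched like this (hypothetical helper; the pids array is fixed-size and error checking is omitted for brevity):

```c
#include <sys/wait.h>
#include <unistd.h>

/* Fork every task first, store the pids, then wait for all of them,
 * so the stages run concurrently.  tasks[i] is a NULL-terminated argv;
 * each task's exit status is written into statuses[i]. */
static void fork_all_then_wait(char **tasks[], int ntasks, int statuses[]) {
    pid_t pids[16];                      /* assume ntasks <= 16 */
    for (int i = 0; i < ntasks; i++) {
        pids[i] = fork();
        if (pids[i] == 0) {              /* child: run the command */
            execvp(tasks[i][0], tasks[i]);
            _exit(127);
        }
    }
    for (int i = 0; i < ntasks; i++) {   /* parent: reap in order */
        int st;
        waitpid(pids[i], &st, 0);
        statuses[i] = WIFEXITED(st) ? WEXITSTATUS(st) : -1;
    }
}
```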
Adham answered 27/5, 2009 at 18:11 Comment(2)
Didn't you just say that "I" was right in the comment above? It didn't feel like you were talking to me saying that Jonathan was right, but the other way around. Maybe I misinterpreted you...Tepid
I just updated the code in the question to reflect the changes suggested.Tepid
C
0

I think your forked processes will continue executing.

Try either:

  • Changing it to 'return execvp'
  • Add 'exit(1);' after execvp
Chrysoberyl answered 27/5, 2009 at 17:24 Comment(7)
Hmm maybe not. But try it anyway =) Maybe make it 'return execvp'Chrysoberyl
I don't think that will work. As far as I know execvp() replaces the current process (child process) by the executing command and so, if the execvp() call is successful, the child process will never reach any code line below that.Tepid
Just tried and as I suspected, doesn't work: 1) "The exec() family of functions replaces the current process image with a new process image." 2) "If any of the exec() functions returns, an error will have occurred." Quotes from man page.Tepid
Yeah, that's what I was thinking... but I did a search for execvp examples, and they all seem to have an 'exit(1);' afterwards.Chrysoberyl
Maybe it pops it off when it's finished executing and resumes your previous process? It just executes it without spawning a new process, but resumes where it leaves off?Chrysoberyl
Then they don't know how exec() works lol... I also didn't and it was here on SO that someone pointed it out to me.Tepid
exec*() never returns if it was successful, but will return if it failed. Thus a exit(1) following an exec*() is to handle failure (for whatever reason... file not found, no permission, etc.)Recrudescence
A
0

One potential problem is that cmdArgs may have garbage at the end of it. You're supposed to terminate that array with a null pointer before passing it to execvp().

It looks like grep is accepting STDIN, though, so that might not be causing any problems (yet).

Adham answered 27/5, 2009 at 17:43 Comment(1)
I believe strsep() deals with that, let's say the command is "cmd arg1 arg2", cmdArgs will be: cmdArgs[0] = "cmd"; cmdArgs[1] = "arg1"; cmdArgs[2] = "arg2"; cmdArgs[3] = NULL;Tepid
C
0

The file descriptors from the pipe are reference counted, and the count is incremented with each fork. For every fork, you have to issue a close on both descriptors in order to reduce the reference count to zero and allow the pipe to close. I'm guessing.
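
A small sketch of that point (hypothetical test function): the child only sees EOF once every copy of the write end, including the parent's, is closed:

```c
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if the child reading from the pipe reaches a clean EOF,
 * which only happens after the parent closes its write-end copy too. */
static int child_sees_eof(void) {
    int fds[2];
    pipe(fds);

    pid_t pid = fork();
    if (pid == 0) {
        close(fds[1]);                  /* child drops its write-end copy */
        char buf[64];
        ssize_t n;
        while ((n = read(fds[0], buf, sizeof buf)) > 0)
            ;
        _exit(n == 0 ? 0 : 1);          /* exit 0 only on a clean EOF */
    }

    close(fds[0]);
    write(fds[1], "hi", 2);
    close(fds[1]);                      /* last write-end copy gone -> EOF */

    int st;
    waitpid(pid, &st, 0);
    return WIFEXITED(st) && WEXITSTATUS(st) == 0;
}
```

If the parent skipped that final close(fds[1]), the child's read() would block forever, which is exactly the symptom in the question.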

Contrivance answered 26/8, 2010 at 12:32 Comment(1)
and also close in the parent process to reduce the reference count to zero, sorry.Contrivance
