Tracking the death of a child process
C

4

25

How can I detect that a child process has died without making the parent process block until the child exits?

I am building a client-server setup in which the server accepts connections from clients and forks a new process for each connection it accepts.

I am ignoring SIGCHLD to prevent zombies from being created.

signal(SIGCHLD, SIG_IGN);
while(1)
{
  accept();
  clients++;
  if(fork() ==0)
  {
     childfunction();
     clients--;
  }
  else
  {
  }
}

The problem is that if the child process is killed inside childfunction(), the global variable clients is never decremented.

NOTE: I am looking for a solution that does not use the SIGCHLD signal, if possible.

Cyclamate answered 4/3, 2010 at 8:32 Comment(3)
You can do something in the signal handler of SIGCHLD. - Descender
As I already mentioned, the SIGCHLD signal is ignored. - Cyclamate
+1 for an insanely dramatic title and opening sentence. D: - Crematorium
D
28

Typically you write a handler for SIGCHLD which calls waitpid() on pid -1. You can use the return value from that to determine what pid died. For example:

void my_sigchld_handler(int sig)
{
    pid_t p;
    int status;

    while ((p=waitpid(-1, &status, WNOHANG)) > 0)
    {
       /* Handle the death of pid p */
    }
}

/* It's better to use sigaction() over signal().  You won't run into the
 * issue where BSD signal() acts one way and Linux or SysV acts another. */

struct sigaction sa;

memset(&sa, 0, sizeof(sa));
sa.sa_handler = my_sigchld_handler;

sigaction(SIGCHLD, &sa, NULL);

Alternatively you can call waitpid(pid, &status, 0) with the child's process ID specified, and synchronously wait for it to die. Or use WNOHANG to check its status without blocking.
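
The non-blocking WNOHANG check mentioned above might look like the following sketch. The helper names child_has_exited() and demo_poll_child() are made up for illustration, and the empty loop body stands in for whatever other work the parent would do between checks.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the child has terminated (it is reaped by
 * this call and *status is filled in), 0 if it is still running, and -1 on
 * error (for example, pid is not a child of this process). */
int child_has_exited(pid_t pid, int *status)
{
    pid_t r = waitpid(pid, status, WNOHANG);
    if (r == pid)
        return 1;
    if (r == 0)
        return 0;
    return -1;
}

/* Demo: fork a child that exits with `code`, then check on it repeatedly
 * without ever blocking in waitpid(). Returns the exit code seen by the
 * parent, or -1 on failure. */
int demo_poll_child(int code)
{
    int status = 0;
    int r;
    pid_t pid = fork();

    if (pid == -1)
        return -1;
    if (pid == 0)
        _exit(code);            /* child: terminate immediately */

    while ((r = child_has_exited(pid, &status)) == 0) {
        /* parent is free to do other work here */
    }
    if (r == 1 && WIFEXITED(status))
        return WEXITSTATUS(status);
    return -1;
}
```

Note that this only works if SIGCHLD is not set to SIG_IGN; when it is ignored, terminated children are discarded and waitpid() has nothing to report.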

Deming answered 4/3, 2010 at 9:19 Comment(7)
But in my example SIGCHLD is ignored. - Cyclamate
@Cyclamate And I'm suggesting you might want to re-evaluate that. You don't need to ignore it to avoid zombies. When you waitpid() the "zombie" goes away. - Deming
@Deming - I am trying to create a daemon process, and using the SIGCHLD handler broke the stability of the application. The parent process would be waiting in the SIGCHLD handler all the time. - Cyclamate
@Cyclamate If it is causing you instability, perhaps you are doing something unsafe in the handler. You have to be careful writing signal handlers because they can be called in the middle of your program, where your invariants might be temporarily broken. Try searching for "posix signal safety". - Deming
This is precisely what the SIGCHLD signal is for. - Seamstress
@Deming - I tried using SIGCHLD in my application and it is not cleaning up the zombies properly. Please check my question, which I updated with my sample app. - Cyclamate
@codingfreak: That if in your chld_handler needs to be a while ((p = waitpid(-1, &status, WNOHANG)) > 0) (because if two or more children exit in quick succession, you might only get one signal). - Seamstress
C
9

None of the solutions so far offer an approach without using SIGCHLD as the question requests. Here is an implementation of an alternative approach using poll as outlined in this answer (which also explains why you should avoid using SIGCHLD in situations like this):

Make sure you have a pipe to/from each child process you create. It can be either their stdin/stdout/stderr or just an extra dummy fd. When the child process terminates, its end of the pipe will be closed, and your main event loop will detect the activity on that file descriptor. From the fact that it closed, you recognize that the child process died, and call waitpid to reap the zombie.

(Note: I omitted some best practices like error-checking and cleaning up file descriptors for brevity)

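#include <poll.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/**
 * Specifies the maximum number of clients to keep track of.
 */
#define MAX_CLIENT_COUNT 1000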

/**
 * Tracks clients by storing their process IDs and pipe file descriptors.
 */
struct process_table {
    pid_t clientpids[MAX_CLIENT_COUNT];
    struct pollfd clientfds[MAX_CLIENT_COUNT];
} PT;

/**
 * Initializes the process table. -1 means the entry in the table is available.
 */
void initialize_table() {
    for (int i = 0; i < MAX_CLIENT_COUNT; i++) {
        PT.clientfds[i].fd = -1;
    }
}

/**
 * Returns the index of the next available entry in the process table.
 */
int get_next_available_entry() {
    for (int i = 0; i < MAX_CLIENT_COUNT; i++) {
        if (PT.clientfds[i].fd == -1) {
            return i;
        }
    }
    return -1;
}

/**
 * Adds information about a new client to the process table.
 */
void add_process_to_table(int i, pid_t pid, int fd) {
    PT.clientpids[i] = pid;
    PT.clientfds[i].fd = fd;
}

/**
 * Removes information about a client from the process table.
 */
void remove_process_from_table(int i) {
    close(PT.clientfds[i].fd);  /* release the parent's read end of the pipe */
    PT.clientfds[i].fd = -1;
}

/**
 * Cleans up any dead child processes from the process table.
 */
void reap_zombie_processes() {
    int p = poll(PT.clientfds, MAX_CLIENT_COUNT, 0);

    if (p > 0) {
        for (int i = 0; i < MAX_CLIENT_COUNT; i++) {
            /* Has the pipe closed? */
            if ((PT.clientfds[i].revents & POLLHUP) != 0) {
                // printf("[%d] done\n", PT.clientpids[i]);
                waitpid(PT.clientpids[i], NULL, 0);
                remove_process_from_table(i);
            }
        }
    }
}

/**
 * Simulates waiting for a new client to connect.
 */
void accept() {
    sleep((rand() % 4) + 1);
}

/**
 * Simulates useful work being done by the child process, then exiting.
 */
void childfunction() {
    sleep((rand() % 10) + 1);
    exit(0);
}

/**
 * Main program
 */
int main() {
    /* Initialize the process table */
    initialize_table();

    while (1) {
        accept();

        /* Create the pipe */
        int p[2];
        pipe(p);

        /* Fork off a child process. */
        pid_t cpid = fork();

        if (cpid == 0) {
            /* Child process */
            close(p[0]);
            childfunction();
        }
        else {
            /* Parent process */
            close(p[1]);
            int i = get_next_available_entry();
            add_process_to_table(i, cpid, p[0]);
            // printf("[%d] started\n", cpid);
            reap_zombie_processes();
        }
    }

    return 0;
}

And here is some sample output from running the program with the printf statements uncommented:

[31066] started
[31067] started
[31068] started
[31069] started
[31066] done
[31070] started
[31067] done
[31068] done
[31071] started
[31069] done
[31072] started
[31070] done
[31073] started
[31074] started
[31072] done
[31075] started
[31071] done
[31074] done
[31081] started
[31075] done
Coneflower answered 6/2, 2016 at 0:47 Comment(0)
I
3

You don't want a zombie. If a child process dies and the parent is still running but never issues a wait()/waitpid() call to harvest the status, the system does not release the resources associated with the child, and a zombie (defunct) process is left in the process table.

Try changing your SIGCHLD handler to something closer to the following:


void chld_handler(int sig) {
    pid_t p;
    int status;

    /* loop as long as there are children to process */
    while (1) {

       /* retrieve child process ID (if any) */
       p = waitpid(-1, &status, WNOHANG);

       /* check for conditions causing the loop to terminate */
       if (p == -1) {
           /* continue on interruption (EINTR) */
           if (errno == EINTR) {
               continue;
           }
           /* break on anything else (EINVAL or ECHILD according to manpage) */
           break;
       }
       else if (p == 0) {
           /* no more children to process, so break */
           break;
       }

       /* valid child process ID retrieved, process accordingly */
       ...
    }   
}

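You could optionally block additional SIGCHLD signals while the signal handler runs. If the handler is installed with sigaction(), SIGCHLD is already blocked for the duration of the handler unless SA_NODEFER is set. If you block signals yourself with sigprocmask(), the mask must be restored to its original value when the signal handling routine has finished.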
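
A minimal sketch of installing such a handler with sigaction(), so that SIGCHLD stays blocked while the handler runs. The names reaped_children, chld_counter_handler() and install_chld_handler() are hypothetical, and the handler only counts reaped children instead of doing real per-child processing.

```c
#define _POSIX_C_SOURCE 200809L  /* expose sigaction() and SA_RESTART */
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Number of children reaped so far. Only the handler writes to it, and
 * sig_atomic_t can be read safely from the main program. */
volatile sig_atomic_t reaped_children = 0;

static void chld_counter_handler(int sig)
{
    int saved_errno = errno;    /* waitpid() may overwrite errno */
    (void)sig;

    /* One SIGCHLD may stand for several exited children, so loop. */
    while (waitpid(-1, NULL, WNOHANG) > 0)
        reaped_children++;

    errno = saved_errno;
}

/* Returns 0 on success and -1 on failure, like sigaction() itself. */
int install_chld_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = chld_counter_handler;
    /* No extra signals are blocked; SIGCHLD itself is blocked while the
     * handler runs because SA_NODEFER is not set. */
    sigemptyset(&sa.sa_mask);
    /* Restart interrupted system calls; ignore stop/continue notifications. */
    sa.sa_flags = SA_RESTART | SA_NOCLDSTOP;

    return sigaction(SIGCHLD, &sa, NULL);
}
```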

If you really don't want to use a SIGCHLD handler, you could try adding the child processing loop somewhere where it would be called regularly and poll for terminated children.
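
That handler-free variant could be sketched roughly as follows, assuming SIGCHLD is left at its default disposition (with SIG_IGN the children are discarded and waitpid() has nothing to report). The names spawn_client() and reap_finished_children() are invented for this example, and spawn_client() is only a stand-in for "accept a connection and fork a worker".

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Number of live children; only the parent's copy is meaningful. */
int clients = 0;

/* Hypothetical stand-in for accepting a connection and forking a worker:
 * the child sleeps for `seconds` and exits. Returns the child's pid, or -1
 * if fork() failed. */
pid_t spawn_client(unsigned int seconds)
{
    pid_t pid = fork();

    if (pid == 0) {
        sleep(seconds);
        _exit(0);
    }
    if (pid > 0)
        clients++;
    return pid;
}

/* Reaps every child that has already terminated, without blocking, and
 * decrements the counter for each one. Returns how many were reaped. */
int reap_finished_children(void)
{
    int reaped = 0;

    for (;;) {
        int status;
        pid_t p = waitpid(-1, &status, WNOHANG);

        if (p > 0) {
            clients--;
            reaped++;
            continue;
        }
        if (p == -1 && errno == EINTR)
            continue;
        /* p == 0: children exist but none has exited yet;
         * p == -1 with ECHILD: no children left at all. */
        break;
    }
    return reaped;
}
```

Calling reap_finished_children() once per iteration of the accept loop keeps the counter current without any signal handler, at the cost of noticing a death only on the next pass through the loop.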

Infantryman answered 4/3, 2010 at 15:46 Comment(2)
After making the changes you suggested, I don't see any zombies being created. Let me see how it behaves in the HTTP server daemon. - Cyclamate
I don't anticipate any major problems with using this code in your daemon. - Infantryman
L
1

After fork(), the parent and the child each have their own copy of the variable 'clients' in separate address spaces, so decrementing it in the child does not affect the value in the parent. I think you need to handle SIGCHLD to maintain the count correctly.
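
A small sketch that shows this effect; the function name parent_value_after_child_decrement() is made up for the demonstration.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int clients = 0;

/* Sets clients to 1, lets a child decrement its own copy, and returns the
 * value the parent sees afterwards (or -1 if fork() failed). */
int parent_value_after_child_decrement(void)
{
    pid_t pid;

    clients = 1;
    pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        clients--;              /* changes the child's copy only */
        _exit(0);
    }

    waitpid(pid, NULL, 0);      /* wait until the child is done */
    return clients;             /* the parent's copy is unchanged */
}
```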

Lye answered 25/6, 2010 at 12:51 Comment(0)
