I have written a simple C program on Red Hat Linux that forks, runs a program in the child with execv, and waits for that child in the parent using waitpid.
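#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main( int argc, char * argv[] )
{
    pid_t pid;
    int status = 0;
    pid_t wait_ret;
    const char * process_path;

    if ( argc < 2 )
    {
        exit( EXIT_FAILURE );
    }
    process_path = argv[1]; //read only after argc has been checked

    pid = fork(); //spawn child process
    if ( -1 == pid ) //fork failed, there is no child to wait for
    {
        printf( "fork failed: %s\n", strerror( errno ) );
        exit( EXIT_FAILURE );
    }
    if ( 0 == pid ) //child
    {
        //execv only returns if it failed
        execv( process_path, &argv[1] );
        printf( "execv failed: %s\n", strerror( errno ) );
        exit( EXIT_FAILURE ); //report the failure to the parent
    }

    //wait for the child to terminate
    wait_ret = waitpid( pid, &status, WUNTRACED );
    if ( -1 == wait_ret )
    {
        printf( "ERROR: Failed to wait for process termination\n" );
        exit( EXIT_FAILURE );
    }

    // ... handlers for child exit status ...

    return 0;
}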
I am using this as a simple watchdog for some processes I am running.
My problem is that one process in particular is not being reaped by waitpid upon exiting and instead remains forever in a Zombie state while waitpid is hung. I am not sure why waitpid is unable to reap this process once it becomes a Zombie (maybe a leaked file descriptor or something).
I could use the WNOHANG flag and poll the child's /proc stat file to check for the Zombie state, but I would prefer a more elegant solution. Maybe there is some function that reports the Zombie state without polling this file?
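For reference, the polling workaround mentioned above could look roughly like the sketch below. It assumes a Linux /proc filesystem, and the helper names proc_state and wait_or_detect_zombie are hypothetical, not part of the program above. The second waitpid call guards against the child exiting between the first waitpid and the read of the stat file.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h> /* fork(), getpid() for callers of these helpers */

/* Hypothetical helper: returns the state character from /proc/<pid>/stat
 * ('R', 'S', 'Z', ...), or '\0' if the file cannot be read or parsed. */
char proc_state( pid_t pid )
{
    char path[64];
    char buf[512];
    FILE * fp;
    char * close_paren;

    snprintf( path, sizeof path, "/proc/%ld/stat", (long) pid );
    fp = fopen( path, "r" );
    if ( NULL == fp )
    {
        return '\0';
    }
    if ( NULL == fgets( buf, sizeof buf, fp ) )
    {
        fclose( fp );
        return '\0';
    }
    fclose( fp );

    /* The command name may contain spaces or parentheses, so look for the
     * LAST ')' and take the state field that follows it. */
    close_paren = strrchr( buf, ')' );
    if ( NULL == close_paren || ' ' != close_paren[1] || '\0' == close_paren[2] )
    {
        return '\0';
    }
    return close_paren[2];
}

/* Hypothetical helper: polls every poll_ms milliseconds.
 * Returns 1 if the child was reaped (*status is filled in),
 *         0 if the child is a zombie that waitpid does not reap,
 *        -1 if waitpid reported an error. */
int wait_or_detect_zombie( pid_t pid, int * status, long poll_ms )
{
    struct timespec delay;
    delay.tv_sec = poll_ms / 1000;
    delay.tv_nsec = ( poll_ms % 1000 ) * 1000000L;

    for ( ;; )
    {
        pid_t ret = waitpid( pid, status, WNOHANG );
        if ( ret == pid ) return 1;
        if ( -1 == ret ) return -1;

        if ( 'Z' == proc_state( pid ) )
        {
            /* The child may have exited just after the call above,
             * so try once more before declaring it stuck. */
            ret = waitpid( pid, status, WNOHANG );
            if ( ret == pid ) return 1;
            if ( -1 == ret ) return -1;
            return 0;
        }
        nanosleep( &delay, NULL );
    }
}
```

In the watchdog, the blocking waitpid call would be replaced by something like wait_or_detect_zombie( pid, &status, 100 ), with a return value of 0 treated as "exited but not reapable". This still polls, so it is only the workaround, not the more elegant alternative being asked for.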
Does anyone know an alternative to waitpid which WILL return when the process becomes a Zombie?
Additional Information:
The child process is being closed by a call to exit( EXIT_FAILURE ); in one of its threads.
cat /proc/<CHILD_PID>/stat (before exit):
1037 (my_program) S 1035 58 58 0 -1 4194560 1309 0 22 0 445 1749 0 0 20 0 13 0 4399 22347776 1136 4294967295 3336716288 3338455332 3472776112 3472775232 3335760920 0 0 4 31850 4294967295 0 0 17 0 0 0 26 0 0 3338489412 3338507560 3338600448
cat /proc/<CHILD_PID>/stat (after exit):
1037 (my_program) Z 1035 58 58 0 -1 4227340 1316 0 22 0 464 1834 0 0 20 0 2 0 4399 0 0 4294967295 0 0 0 0 0 0 0 4 31850 4294967295 0 0 17 0 0 0 26 0 0 0 0 0
Note that the child PID is 1037 and the parent PID is 1035 in this case.
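To make the two dumps easier to compare, here is a small sketch that pulls out two fields from a stat line. It assumes the field layout documented in proc(5), where field 3 is the state and field 20 is num_threads. The function name parse_stat_line is hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: extracts the state (field 3) and num_threads
 * (field 20) from one line in /proc/<pid>/stat format.
 * Returns 0 on success, -1 if the line cannot be parsed. */
int parse_stat_line( const char * line, char * state, long * num_threads )
{
    /* The command name in field 2 may contain spaces or parentheses,
     * so start parsing after the LAST ')'. */
    const char * p = strrchr( line, ')' );
    if ( NULL == p )
    {
        return -1;
    }

    /* After the state come 16 fields we skip (fields 4 to 19),
     * then num_threads. Skipped fields do not count in the result. */
    if ( 2 != sscanf( p + 1,
                      " %c %*s %*s %*s %*s %*s %*s %*s %*s"
                      " %*s %*s %*s %*s %*s %*s %*s %*s %ld",
                      state, num_threads ) )
    {
        return -1;
    }
    return 0;
}
```

Applied to the two dumps above, this reports state S with 13 threads before exit and state Z with 2 threads after exit. If that reading of the fields is right, one thread besides the group leader had apparently not finished exiting when the second dump was taken, which might be worth investigating.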
… waitpid()? – Urina
… strace) to see what it is doing to get into this persistent zombie state? Have you examined its /proc data; have you run lsof on it? If you kill your watchdog process, what happens to the zombie? … If it remains a zombie after its parent is gone, this is a Unix question. – Teleutospore
… waitpid doesn’t work). Unfortunately nothing is jumping out at me as a reason or explanation. (2) I’m just realizing that I may have partially misread the question. Are you asking how to reap this zombie, or how to detect that the process has become a zombie? (3) Try catching SIGCHLD. … Good luck. – Teleutospore