waitpid for child process not succeeding

1

5

I start a process with execv and let it write to a file. At the same time I start a thread that monitors the file, using stat.st_size, so that its size does not exceed a certain limit. When the limit is hit, I call waitpid on the child process from that thread, but the call fails and the process I started in the background becomes a zombie. When I stop the child with the same waitpid call from the main thread, the process is killed without becoming a zombie. Any ideas?

Edit: waitpid returns -1 and errno is 10. This is on a Linux platform.

Hartmann answered 4/5, 2015 at 14:31 Comment(6)
"but this throws an error ..." and that error would be... what? Include error codes and all related messages verbatim in your posted question. You may as well include the platform info too. - Fairleigh
waitpid returned -1 with errno set to 10. The errno seems to indicate that the child process does not exist, but that does not seem to be the case, since I can see the process with ps ax. The OS is Linux. - Hartmann
And that would be in a comment. That info belongs in your posted question. Regardless, the "Linux Notes" section in the documentation of waitpid may be relevant. - Fairleigh
Thanks for pointing that out. So if I use the __WALL option, I should be able to wait on child processes created by the main thread. I edited the question to include the error information. - Hartmann
A quick look at /usr/include/asm-generic/errno-base.h shows that 10 is ECHILD (no child processes). (You can convert errno to a string with strerror_r(3).) See the waitpid(2) man page for more information. - Interlunation
Also please note: "The exec() family of functions replaces the current process image with a new process image." How exactly are you starting the child process? - Perrotta
3

This is difficult to debug without code, but errno 10 is ECHILD.

Per the man page, this is returned as follows:

ECHILD (for waitpid() or waitid()) The process specified by pid (waitpid()) or idtype and id (waitid()) does not exist or is not a child of the calling process. (This can happen for one's own child if the action for SIGCHLD is set to SIG_IGN. See also the Linux Notes section about threads.)

In short, the pid you are specifying is not a child of the process calling waitpid() (or no longer is, for example because it has already terminated and been reaped).
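To see which errno you are actually getting, it helps to wrap the call and print the error symbolically. Here is a minimal sketch; `wait_and_report` is a made-up helper name, not something from your code, and it assumes a blocking wait with no option flags:

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical helper: wait for `pid` and report any failure on stderr.
 * Returns 0 if waitpid() succeeded, otherwise the errno value it set. */
int wait_and_report(pid_t pid, int *status)
{
    pid_t rc;

    /* Retry if a signal interrupts the wait. */
    do {
        rc = waitpid(pid, status, 0);
    } while (rc == -1 && errno == EINTR);

    if (rc == -1) {
        int saved = errno;  /* save errno before any other call can change it */
        fprintf(stderr, "waitpid(%ld) failed: errno=%d (%s)\n",
                (long)pid, saved, strerror(saved));
        return saved;
    }
    return 0;
}
```

Calling it with a pid that is not a child of the caller (pid 1, for instance) should return ECHILD, which is the 10 you are seeing on Linux.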

Note the parenthetical section:

  • "This can happen for one's own child if the action for SIGCHLD is set to SIG_IGN" - if you have set the disposition of SIGCHLD to SIG_IGN, terminated children are reaped automatically and never go through the zombie state. waitpid therefore has nothing left to collect and fails with ECHILD.

  • "See also the Linux Notes section about threads." - In Linux, threads are essentially processes. Since Linux 2.4, one thread can wait for the children of other threads, provided they are in the same thread group (broadly, the same parent process). Before Linux 2.4 this was not the case. See the documentation on __WNOTHREAD for details.
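The thread case can be checked in isolation with something like the following. This is only a sketch under my own assumptions (POSIX threads, a child that exits straight away instead of calling execv), and the names `wait_job`, `waiter_thread` and `fork_here_wait_in_thread` are invented for illustration:

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Data shared between the forking thread and the waiting thread. */
struct wait_job {
    pid_t pid;    /* child to wait for */
    int   status; /* wait status filled in by waitpid() */
    int   err;    /* 0 on success, otherwise errno from waitpid() */
};

/* Thread body: wait for a child that a different thread forked. */
static void *waiter_thread(void *arg)
{
    struct wait_job *job = arg;
    pid_t rc;

    do {
        rc = waitpid(job->pid, &job->status, 0);
    } while (rc == -1 && errno == EINTR);

    job->err = (rc == -1) ? errno : 0;
    return NULL;
}

/* Fork in the calling thread, then reap the child from a second thread.
 * Returns the child's exit code, or -1 if anything failed. */
int fork_here_wait_in_thread(int exit_code)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0)
        _exit(exit_code);  /* child: stand-in for the real execv() call */

    struct wait_job job = { pid, 0, 0 };
    pthread_t tid;
    if (pthread_create(&tid, NULL, waiter_thread, &job) != 0) {
        waitpid(pid, NULL, 0);  /* do not leave a zombie behind */
        return -1;
    }
    pthread_join(tid, NULL);

    if (job.err != 0 || !WIFEXITED(job.status))
        return -1;
    return WEXITSTATUS(job.status);
}
```

Build with -pthread. If this returns the child's exit code on your system but your real code still gets ECHILD, the difference is probably in how the child is started or in the SIGCHLD disposition, not in the threading.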

I'm guessing the thread thing is a red herring, and the problem is actually the signal handler, as this accords with your statement 'the process is killed without becoming a zombie.'
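For reference, the SIG_IGN effect is easy to reproduce. The sketch below is not taken from your program; `wait_with_sigchld_ignored` is a hypothetical name, and the child simply exits where yours would call execv:

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Ignore SIGCHLD, fork a child that exits at once, then try to wait for it.
 * Returns 0 if waitpid() succeeded, the errno it set if it failed,
 * or -1 if the setup itself failed. */
int wait_with_sigchld_ignored(void)
{
    struct sigaction ign, old;
    ign.sa_handler = SIG_IGN;
    sigemptyset(&ign.sa_mask);
    ign.sa_flags = 0;
    if (sigaction(SIGCHLD, &ign, &old) == -1)
        return -1;

    int result = -1;
    pid_t pid = fork();
    if (pid == 0)
        _exit(0);  /* child: stand-in for the real execv() call */

    if (pid > 0) {
        int status;
        pid_t rc;
        /* Blocks until the child is gone; the kernel reaps it itself. */
        do {
            rc = waitpid(pid, &status, 0);
        } while (rc == -1 && errno == EINTR);
        result = (rc == -1) ? errno : 0;
    }

    sigaction(SIGCHLD, &old, NULL);  /* restore the previous disposition */
    return result;
}
```

On Linux this should return ECHILD, the same errno 10 as in the question. Keep in mind that an ignored SIGCHLD can also be inherited from whatever launched your program or set by a library, so it is worth checking even if your own code never touches it.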

Venosity answered 4/5, 2015 at 20:10 Comment(1)
I am not setting up a signal handler for SIGCHLD, so that should not be the issue. I am using a 2.6 kernel, which should implicitly allow my thread to wait on the process created from the main thread, but that is not happening. I set the __WALL and __WCLONE options for waitpid, to no avail. I will keep probing and get back to you. Thanks for the suggestions though. - Hartmann
