Parallel processing in shell scripting, 'pid is not a child of this shell'

I have a question about parallel processing in shell scripting. I have a program, myProgram, which I wish to run multiple times, in a loop within a loop. The script is basically this:

MYPATHDIR=`ls $MYPATH`
for SUBDIRS in $MYPATHDIR; do
  SUBDIR_FILES=`ls $MYPATH/$SUBDIRS`
  for SUBSUBDIRS in $SUBDIR_FILES; do
    find $MYPATH/$SUBDIRS/$SUBSUBDIRS | ./myProgram $MYPATH/$SUBDIRS/outputfile.dat
  done
done

What I wish to do is to take advantage of parallel processing. So I tried this for the middle line to start all the myPrograms at once:

(find $MYPATH/$SUBDIRS/$SUBSUBDIRS | ./myProgram $MYPATH/$SUBDIRS/outputfile.dat &)

However, this began all 300 or so calls to myProgram simultaneously, causing RAM issues etc.

What I would like to do is to run each occurrence of myProgram in the inner loop in parallel, but wait for all of these to finish before moving on to the next outer loop iteration. Based on the answers to this question, I tried the following:

for SUBDIRS in $MYPATHDIR; do
  SUBDIR_FILES=`ls $MYPATH/$SUBDIRS`
  for SUBSUBDIRS in $SUBDIR_FILES; do
    (find $MYPATH/$SUBDIRS/$SUBSUBDIRS | ./myProgram $MYPATH/$SUBDIRS/outputfile.dat &)
  done
  wait $(pgrep myProgram)   
done

But I got the following warning/error, repeated multiple times:

./myScript.sh: line 30: wait: pid 1133 is not a child of this shell

...and all the myPrograms were started at once, as before.

What am I doing wrong? What can I do to achieve my aims? Thanks.

Christo answered 7/11, 2011 at 17:46 Comment(3)
() invokes a subshell, which then invokes find/myprogram, so you're dealing with "grandchildren" processes. You can't wait on grandchildren, only direct descendants (aka children). - Fiorin
I see. Can I alter my code to make them children instead of grandchildren? - Christo
Right, I've taken out the brackets and it seems to be working. Megaparty, post your comment as an answer and you're in the money. - Christo

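() invokes a subshell, which then invokes find/myProgram, so you're dealing with "grandchildren" processes. You can't wait on grandchildren, only on direct descendants (aka children). Remove the parentheses and background the pipeline directly with &, so that it becomes a child of the script's own shell.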
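
A minimal sketch of the loop without the subshell, assuming a POSIX-style shell such as bash. The my_program function and the temporary directory tree are hypothetical stand-ins for ./myProgram and $MYPATH, added only so that the snippet runs on its own. Because each backgrounded pipeline is now a direct child of the script, a bare wait (no PIDs, no pgrep) blocks until all of them have exited:

```shell
# Hypothetical stand-in for ./myProgram: reads file names on stdin and
# appends how many it received to the output file given as $1.
my_program() { wc -l >> "$1"; }

# Hypothetical directory tree standing in for $MYPATH.
MYPATH=$(mktemp -d)
mkdir -p "$MYPATH/a/x" "$MYPATH/a/y" "$MYPATH/b/x" "$MYPATH/b/y"

for subdir in "$MYPATH"/*/; do
  for subsubdir in "$subdir"*/; do
    # No parentheses: the backgrounded pipeline is a child of this shell.
    find "$subsubdir" | my_program "${subdir}outputfile.dat" &
  done
  # With no arguments, wait blocks until every background child has exited,
  # so the next outer iteration starts only after this batch is done.
  wait
  echo "finished batch: $subdir"
done
```

With this shape, only the jobs of one outer iteration run at the same time.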

Fiorin answered 7/11, 2011 at 18:0 Comment(0)

You may find GNU Parallel useful.

parallel -j+0 ./myProgram ::: $MYPATH/$SUBDIRS/*

This runs as many ./myProgram jobs in parallel as there are CPU cores.
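
Where GNU Parallel is not installed, xargs -P can give a similar cap on concurrent jobs. Note that this is a different tool from the one suggested above, and that -P is an extension found in GNU and BSD xargs rather than part of POSIX. In the sketch below, the directory tree is a hypothetical stand-in for $MYPATH, wc -l stands in for ./myProgram, and path names are assumed to contain no newlines, quotes or backslashes:

```shell
# Hypothetical directory tree standing in for $MYPATH.
demo_root=$(mktemp -d)
mkdir -p "$demo_root/a/x" "$demo_root/a/y" "$demo_root/b/x" "$demo_root/b/y"

# Number of online CPU cores; fall back to 2 if getconf cannot report it.
max_jobs=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 2)

# One line per sub-subdirectory; xargs keeps at most $max_jobs commands
# running. `wc -l` is a stand-in for ./myProgram, and dirname selects the
# output file in the parent directory, as in the original loop.
printf '%s\n' "$demo_root"/*/*/ |
  xargs -P "$max_jobs" -I{} sh -c \
    'find "$1" | wc -l >> "$(dirname "$1")/outputfile.dat"' _ {}
```

xargs returns only after all of its jobs have finished, so no separate wait is needed here.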

Chesterchesterfield answered 8/11, 2011 at 2:41 Comment(0)

To wait for a non-child process, you can watch the proc filesystem (Linux):

while [ -e /proc/$pid ]; do sleep 1; done

This can keep waiting wrongly if the process terminates and another process immediately takes the same PID.

Fix: also check the process start time:

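_wait() {
  # wait for a non-child process (relies on /proc, so Linux only)
  local pid=$1
  # timestamp of /proc/$pid, used as a stand-in for the process start time
  local pst=$(stat -c%X /proc/$pid 2>/dev/null || true)
  [ -z "$pst" ] && return
  # loop until /proc/$pid disappears or its timestamp changes (PID reused)
  while [ "$(stat -c%X /proc/$pid 2>/dev/null || true)" = "$pst" ]; do sleep 1; done
}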

_wait 12345
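
The timestamp of the /proc/PID directory is, as far as I know, only an approximation of the start time. A possible variant, sketched here under the assumption of a Linux /proc as described in proc(5), reads the kernel's own start time from field 22 of /proc/PID/stat and also treats zombies as finished. The names proc_start_time and wait_for_pid are hypothetical, and the demo process is just a sleep started as a grandchild:

```shell
# Print the start time (field 22 of /proc/PID/stat, in clock ticks since
# boot) of a live process; print nothing if it is gone or a zombie.
proc_start_time() {
  local stat_line
  stat_line=$(cat "/proc/$1/stat" 2>/dev/null) || return 0
  # Field 2 (the command name) may contain spaces, so drop everything up to
  # the last ") " before splitting; field 3 (state) then becomes $1.
  set -- ${stat_line##*) }
  [ "$1" = "Z" ] && return 0
  echo "${20}"
}

# Poll once per second until the process is gone or its PID has been reused.
wait_for_pid() {
  local pid="$1" started
  started=$(proc_start_time "$pid")
  [ -z "$started" ] && return 0
  while [ "$(proc_start_time "$pid")" = "$started" ]; do
    sleep 1
  done
}

# Demo: start a grandchild, which the wait builtin would refuse, and block
# until it has exited.
pid=$(sh -c 'sleep 2 >/dev/null 2>&1 & echo $!')
wait_for_pid "$pid"
echo "process $pid has finished"
```

On a system without /proc, proc_start_time prints nothing, so wait_for_pid returns immediately instead of waiting.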
Stolzer answered 3/8 at 10:30 Comment(0)
