How to limit number of sub-processes used in a function
My question is: how do I change this code so that it uses only 4 sub-processes?

TESTS="a b c d e"

for f in $TESTS; do
  t=$(( (RANDOM % 5) + 1 ))
  sleep $t && echo $f $t &
done
wait
Ison answered 28/6, 2011 at 19:13 Comment(2)
These are not threads, but sub-processes. – Spiffing
Could you please consider accepting an answer and changing "threads" to "sub-processes" in your question? Using the wrong words makes it harder to find on the web! ;-) – Ammoniate

Interesting question. I tried to use xargs for this and I found a way.

Try this:

seq 10 | xargs -i --max-procs=4 bash -c "echo start {}; sleep 3; echo done {}"

--max-procs=4 will ensure that no more than four subprocesses are running at a time.

The output will look like this:

start 2
start 3
start 1
start 4
done 2
done 3
done 1
done 4
start 6
start 5
start 7
start 8
done 6
done 5
start 9
done 8
done 7
start 10
done 9
done 10

Note that the order of execution might not follow the order in which you submit the commands; as you can see, 2 started before 1.
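The same idea can be applied to the loop from the question. Here is a sketch (it uses the `-I{}` placeholder form instead of the deprecated `-i`, and `$RANDOM` requires bash):

```shell
# Sketch: feed the test names to xargs, at most four bash workers at a time.
# xargs substitutes {} in the command string before bash ever parses it.
TESTS="a b c d e"
printf '%s\n' $TESTS | xargs -I{} --max-procs=4 bash -c 't=$(( (RANDOM % 5) + 1 )); sleep "$t"; echo {} "$t"'
```

As above, the lines appear in completion order, not submission order.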

Fibrovascular answered 28/6, 2011 at 21:24 Comment(3)
BusyBox xargs does not have a --max-procs option though. – Alanalana
@Alanalana It seems like -P would do it. Try this: docker run --rm busybox sh -c "seq 10 | xargs -I'{}' -P 4 sh -c 'echo start {}; sleep 3; echo done {}'" – Fibrovascular
--max-procs and -P are the same option. On OpenWrt 19.07, neither is available; I don't know whether it is missing from BusyBox entirely or was removed at compile time to save space. I had to write a custom script to handle this, which is quite lengthy and probably slow compared to native C code. – Alanalana

Quick and dirty solution: insert this line somewhere inside your for loop:

while [ $(jobs | wc -l) -ge 4 ] ; do sleep 1 ; done

(assumes you don't already have other background jobs running in the same shell)
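Dropped into the loop from the question, the whole thing might look like this (a sketch; it uses `jobs -r` so that only running jobs are counted, per the comment below):

```shell
# Sketch: throttle the original loop to at most 4 concurrent background jobs
TESTS="a b c d e"
for f in $TESTS; do
  # busy-wait until fewer than 4 jobs are running
  while [ "$(jobs -r | wc -l)" -ge 4 ]; do sleep 1; done
  t=$(( (RANDOM % 5) + 1 ))
  sleep "$t" && echo "$f" "$t" &
done
wait
```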

Kenyakenyatta answered 28/6, 2011 at 21:57 Comment(3)
This might be a wee bit more efficient with jobs -r; without it, you'll also count the lines where bash tells you which jobs are already done. – Cribbage
This solution is great. I just inserted the line at the right place and boom! I liked that I didn't need to change my script structure at all. – Accordant
Don't you have to use & to send the command inside the loop body to the background? The count always seems to be 0 for me; could that be because this is itself inside another while loop? How can I see the background processes of other shells? – Neilla

I have found another solution to this question, using parallel (part of the moreutils package):

parallel -j 4 -i bash -c "echo start {}; sleep 2; echo done {};" -- $(seq 10)

-j 4 sets the maximum number of parallel jobs (maxjobs)

-i substitutes each argument for {} in the command

-- separates the command from its arguments

The output of this command will be:

start 3
start 4
start 1
start 2
done 4
done 2
done 3
done 1
start 5
start 6
start 7
start 8
done 5
done 6
start 9
done 7
start 10
done 8
done 9
done 10
Fibrovascular answered 30/6, 2011 at 21:30 Comment(0)

You can do something like this by using the jobs builtin:

for f in $TESTS; do
  running=($(jobs -rp))
  while [ ${#running[@]} -ge 4 ] ; do
    sleep 1   # this is not optimal, but you can't use wait here
    running=($(jobs -rp))
  done
  t=$(( (RANDOM % 5) + 1 ))
  sleep $t && echo $f $t &
done
wait
Cribbage answered 28/6, 2011 at 19:49 Comment(0)

GNU Parallel is designed for this kind of task:

TESTS="a b c d e"
for f in $TESTS; do
  t=$(( (RANDOM % 5) + 1 ))
  sem -j4 "sleep $t && echo $f $t"
done
sem --wait

Note the quotes around the command: without them, the && echo part would run in the calling shell as soon as sem returns, rather than under sem after the sleep.

Watch the intro videos to learn more:

http://www.youtube.com/playlist?list=PL284C9FF2488BC6D1

Enactment answered 5/1, 2012 at 10:0 Comment(0)

A generalized answer, useful when there is a (not too) large number of long-running jobs, using only POSIX shell features and the filesystem.

  1. Put all jobs as scripts into a jobs directory.
  2. Run a number of job processors to process them.

Here follows a very simple job processor; install it under the name queue:

#!/bin/sh

queue () {
    ID="$1" JOBS="$2"
    for JOB in "$JOBS"/*.job; do
        mv "$JOB" "$JOB.$ID" 2>/dev/null || continue
        sh "$JOB.$ID" && rm "$JOB.$ID"
    done
}

for JOB in $(seq $1); do  queue $JOB $2 &  done
wait

queue 4 workdir starts four job processors for the directory workdir.

Answer to the question:

Create the five jobs:

for f in a b c d e; do
    t=$((RANDOM % 5 + 1))
    echo "sleep $t
    echo $f $t" > workdir/job$((I+=1)).job
done

Process them in four sub-processes and wait:

queue 4 workdir; wait

If a job fails, it is left behind in workdir/jobM.job.N for inspection or retry; M is the number of the job, N is the number of the queue that processed it.
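Putting the pieces together, here is a self-contained sketch of the whole approach (fixed sleep times are used here, since $RANDOM is not POSIX):

```shell
#!/bin/sh
# Sketch: create five jobs, run four queue workers over them, wait for all
mkdir -p workdir
I=0
for f in a b c d e; do
    printf 'sleep 1\necho %s\n' "$f" > workdir/job$((I+=1)).job
done

queue () {
    ID="$1" JOBS="$2"
    for JOB in "$JOBS"/*.job; do
        mv "$JOB" "$JOB.$ID" 2>/dev/null || continue   # claim the job atomically
        sh "$JOB.$ID" && rm "$JOB.$ID"                 # run it; keep it on failure
    done
}

for ID in 1 2 3 4; do queue "$ID" workdir & done
wait   # all workers done; workdir is empty unless a job failed
```

The mv-based claim is what makes this safe: only one worker can successfully rename a given job file, so no job is run twice.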

Som answered 19/4 at 10:47 Comment(0)

This tested script runs 5 jobs at a time and will start a new job as soon as one finishes (thanks to the kill of the sleep 10.9 when we get a SIGCHLD). A simpler version could use direct polling (change the sleep 10.9 to sleep 1 and get rid of the trap).

#!/usr/bin/bash

set -o monitor
trap "pkill -P $$ -f 'sleep 10\.9' >&/dev/null" SIGCHLD

totaljobs=15
numjobs=5
worktime=10
curjobs=0
declare -A pidlist

dojob()
{
  slot=$1
  time=$(echo "$RANDOM * 10 / 32768" | bc -l)
  echo Starting job $slot with args $time
  sleep $time &
  pidlist[$slot]=`jobs -p %%`
  curjobs=$(($curjobs + 1))
  totaljobs=$(($totaljobs - 1))
}

# start
while [ $curjobs -lt $numjobs -a $totaljobs -gt 0 ]
 do
  dojob $curjobs
 done

# Poll for jobs to die, restarting while we have them
while [ $totaljobs -gt 0 ]
 do
  for ((i=0;$i < $curjobs;i++))
   do
    if ! kill -0 ${pidlist[$i]} >&/dev/null
     then
      dojob $i
      break
     fi
   done
   sleep 10.9 >&/dev/null
 done
wait
Conventicle answered 28/6, 2011 at 20:5 Comment(0)

This is my "parallel" unzip loop using bash on AIX:

for z in *.zip ; do
  7za x "$z" >/dev/null &
  while [ $(jobs -p | wc -l) -ge 4 ] ; do
    wait -n
  done
done
wait

Notes:

  • jobs -p (bash builtin) lists the background jobs of the immediate parent shell
  • wait -n (bash builtin, 4.3+) waits for any one background job to finish
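A minimal, self-contained version of the same pattern, with plain sleeps standing in for the 7za calls (requires bash 4.3+ for wait -n):

```shell
#!/usr/bin/env bash
# Sketch: run 8 dummy jobs, never more than 4 at once
for i in 1 2 3 4 5 6 7 8; do
  sleep 1 &                                # stand-in for real work
  while [ "$(jobs -p | wc -l)" -ge 4 ]; do
    wait -n                                # block until any one job exits
  done
done
wait   # drain the remaining jobs
echo all done
```

Compared with the sleep-polling answers above, wait -n blocks in the kernel instead of waking up once a second, so a slot is refilled the moment a job exits.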
Peary answered 5/5, 2022 at 11:10 Comment(0)