I am looking for a way to clean up the mess when my top-level script exits.
Especially if I want to use set -e, I wish the background process would die when the script exits.
To clean up some mess, trap can be used. It takes a list of commands to execute when a specific signal arrives:
trap "echo hello" SIGINT
but can also be used to execute something when the shell exits:
trap "killall background" EXIT
It's a builtin, so help trap will give you information (works with bash). If you only want to kill background jobs, you can do
trap 'kill $(jobs -p)' EXIT
Be careful to use single quotes ('), to prevent the shell from substituting the $() immediately.
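The effect of the quoting can be seen in a small experiment. This is only a minimal sketch, assuming bash; the function names demo_double and demo_single and the variable value are made up for illustration. Each function body runs in a subshell, so its EXIT trap fires when the function finishes.

```shell
#!/usr/bin/env bash
# Hypothetical demo functions (not from the answer). Each body is a
# subshell "( ... )", so the EXIT trap fires when that subshell ends.

demo_double() (
  value="at trap time"
  # Double quotes: $value is expanded right now, while the trap is being set.
  trap "echo $value" EXIT
  value="at exit time"
)

demo_single() (
  value="at trap time"
  # Single quotes: $value is expanded later, when the trap actually runs.
  trap 'echo $value' EXIT
  value="at exit time"
)

double_result=$(demo_double)
single_result=$(demo_single)

echo "double quotes: $double_result"
echo "single quotes: $single_result"
```

With $(jobs -p) in place of $value, the double-quoted form would capture the job list as it was when the trap was installed, which is usually empty.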
kill $(jobs -p) doesn't work in dash, because it executes command substitution in a subshell (see Command Substitution in man dash) – Neolamarckism
killall background supposed to be a placeholder? background is not in the man page... – Circumnavigate
jobs -p to a temporary file and read it from there for kill. – Duky
kill $(jobs -p) is good, but prints usage info for 'kill' when there are no background jobs. IMHO, the best way for bash is jobs -p | xargs -r kill – Cretaceous
EXIT on ctrl-c. Adding trap "exit" INT TERM ERR along with trap "kill 0" EXIT fixes this problem – Marquise
This works for me (improved thanks to the commenters):
trap "trap - SIGTERM && kill -- -$$" SIGINT SIGTERM EXIT
kill -- -$$ sends a SIGTERM to the whole process group, thus also killing descendants. The <PGID> in kill -- -<PGID> is the process group ID, which often, but not necessarily, is the PID that the $$ variable contains. The few times PGID and PID differ, you can use ps and similar tools to obtain the PGID in your script.
For example: pgid="$(ps -o pgid= $$ | grep -o '[0-9]*')" stores the PGID in $pgid.
Specifying signal EXIT is useful when using set -e (more details here).
-$$. It evaluates to '-<PID>', e.g. -1234. In the kill manpage // builtin manpage a leading dash specifies the signal to be sent. However -- probably blocks that, but then the leading dash is undocumented otherwise. Any help? – Circumnavigate
man 2 kill, which explains that when a PID is negative, the signal is sent to all processes in the process group with the provided ID (en.wikipedia.org/wiki/Process_group). It's confusing that this is not mentioned in man 1 kill or man bash, and could be considered a bug in the documentation. – Whatever
kill -- -$$ might not terminate anything depending on how your script is called, e.g. if you call it like (sleep 100 & your_script) in shell. On the other hand, if you use kill -- 0 in the trap, it would terminate sleep 100 and the enclosing shell, too. – Duky
trap 'trap " " SIGTERM; kill 0; wait; cleanup SIGINT SIGTERM. Do you see any problem with this? – Satterwhite
SC2064: Use single quotes, otherwise this expands now rather than when signalled., should the answer be updated to reflect this? – Builtin
kill 0 would not have an effect. – Duky
$$ remains the same. – Duky
trap - SIGTERM will reset the current script SIGTERM response to the default kill behavior. Then, when kill -- -$$ is executed, the current script will receive SIGTERM and exit normally. – Puton
enable -n kill) or use /bin/kill, as it may be that the builtin kill doesn't support the -pgid syntax. – Lurie
5.2.15 and kill -SIGTERM -- -pid appears to be working just fine. – Valdis
bash mirror I learned that Bash 2.05 fixed support for -pid. So either I was working with a really ancient bash system, or perhaps it was a mac with some neutered bash version, or I wasn't working with bash at all. – Lurie
|| true at the end or would get error during Dockerfile build. No process to kill due to docker triggering it I suppose. As a side note putting custom exit logic before calling trap i.e. myExit; trap... works fine for me. –
Rennes
Update: https://mcmap.net/q/108000/-how-do-i-kill-background-processes-jobs-when-my-shell-script-exits improves this by adding exit status and a cleanup function.
trap "exit" INT TERM
trap "kill 0" EXIT
Why convert INT and TERM to exit? Because both should trigger the kill 0 without entering an infinite loop.
Why trigger kill 0 on EXIT? Because normal script exits should trigger kill 0, too.
Why kill 0? Because nested subshells need to be killed as well. This will take down the whole process tree.
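The chaining of the two traps can be tried without risk by swapping kill 0 for a harmless echo. This is a minimal sketch, assuming bash is on PATH; the child script and the variable name output are illustrative only.

```shell
#!/usr/bin/env bash
# Run the two traps in a child bash so the experiment cannot affect the
# calling shell. "echo" stands in for "kill 0" here (an assumption made
# only to keep the demo harmless).
output=$(
  bash -c '
    trap "exit" INT TERM               # a signal becomes a normal exit...
    trap "echo cleanup ran" EXIT       # ...and every exit runs the cleanup
    kill -TERM $$                      # simulate an external SIGTERM
    sleep 5                            # never reached: the TERM trap exits first
  '
)
echo "$output"
```

The TERM trap only calls exit, so the cleanup lives in one place (the EXIT trap) and runs a single time, whether the script ends normally or through a signal.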
GNU bash, version 4.3.33(1)-release (x86_64-apple-darwin14.0.0). – Beebread
kill 0 means/does? – Intermolecular
kill 0 – Spadework
kill 0 to just the shell script's processes: run the script under setsid -w (which has other side effects, unfortunately); #6550163 – Hubblebubble
QUIT as well to the traps. – Pahari
trap "exit" INT TERM
trap "sleep 0.1; kill 0" EXIT
– Enthronement
The trap 'kill 0' SIGINT SIGTERM EXIT solution described in @tokland's answer is really nice, but latest Bash crashes with a segmentation fault when using it. That's because Bash, starting from v. 4.3, allows trap recursion, which becomes infinite in this case:
SIGINT or SIGTERM or EXIT;
kill 0, which sends SIGTERM to all processes in the group, including the shell itself;
This can be worked around by manually de-registering the trap:
trap 'trap - SIGTERM && kill 0' SIGINT SIGTERM EXIT
A fancier way that allows printing the received signal and avoids "Terminated:" messages:
#!/usr/bin/env bash
trap_with_arg() { # from https://mcmap.net/q/110297/-is-it-possible-to-detect-which-trap-signal-in-bash-duplicate
local func="$1"; shift
for sig in "$@"; do
trap "$func $sig" "$sig"
done
}
stop() {
trap - SIGINT EXIT
printf '\n%s\n' "received $1, killing child processes"
kill -s SIGINT 0
}
trap_with_arg 'stop' EXIT SIGINT SIGTERM SIGHUP
{ i=0; while (( ++i )); do sleep 0.5 && echo "a: $i"; done } &
{ i=0; while (( ++i )); do sleep 0.6 && echo "b: $i"; done } &
while true; do read; done
UPD: added a minimal example; improved stop function to avoid de-trapping unnecessary signals and to hide "Terminated:" messages from the output. Thanks Trevor Boyd Smith for the suggestions!
stop() you provide the first argument as the signal number but then you hardcode what signals are being deregistered. Rather than hardcode the signals being deregistered you could use the first argument to deregister in the stop() function (doing so would potentially stop other recursive signals (other than the 3 hardcoded)). – Radix
SIGINT, but kill 0 sends SIGTERM, which will get trapped once again. This will not produce infinite recursion, though, because SIGTERM will be de-trapped during the second stop call. – Eelgrass
trap - $1 && kill -s $1 0 should work better. I'll test and update this answer. Thank you for the nice idea! :) – Eelgrass
trap - $1 && kill -s $1 0 wouldn't work either, as we can't kill with EXIT. But it is really sufficient to de-trap TERM, because kill sends this signal by default. – Eelgrass
EXIT, the trap signal-handler is always only executed once. – Radix
stop doesn't require modification, regardless of the signal list. And it seems that, when the shell gets killed with INT, it doesn't print "Terminated:" messages (which it does in case of TERM). – Eelgrass
kill -s EXIT pid, which stop will try to do if it was written like trap - $1 && kill -s $1 0 and invoked by EXIT. – Eelgrass
/bin/sh symlinked to dash produces the error trap: SIGINT: bad trap. Removing SIG prefix from signal names (such that SIGINT becomes INT, etc.) works as expected with both dash and bash. – Pitzer
"recieved $1, killing children" meant "earning a dollar and murdering kids". You might wanna reword that. – Photima
trap - SIGINT EXIT command clears the trap so the stop function doesn't get called recursively when the process finally exits. – Eelgrass
trap 'kill $(jobs -p)' EXIT
I would make only minor changes to Johannes' answer and use jobs -pr to limit the kill to running processes and add a few more signals to the list:
trap 'kill $(jobs -pr)' SIGINT SIGTERM EXIT
To be on the safe side I find it better to define a cleanup function and call it from trap:
cleanup() {
local pids=$(jobs -pr)
[ -n "$pids" ] && kill $pids
}
trap "cleanup" INT QUIT TERM EXIT [...]
or avoiding the function altogether:
trap '[ -n "$(jobs -pr)" ] && kill $(jobs -pr)' INT QUIT TERM EXIT [...]
Why? Because by simply using trap 'kill $(jobs -pr)' [...] one assumes that there will be background jobs running when the trap condition is signalled. When there are no jobs one will see the following (or similar) message:
kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ... or kill -l [sigspec]
because jobs -pr is empty - I ended in that 'trap' (pun intended).
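Both situations (no jobs, one running job) can be exercised in a short script. This is a sketch under the assumption that it runs in bash; kill_running_jobs is a hypothetical name for a variant of the cleanup function that uses if, so that it also returns 0 when there is nothing to kill.

```shell
#!/usr/bin/env bash
# kill_running_jobs is a hypothetical name; it mirrors the cleanup function
# from the answer but uses "if" so it returns 0 when no job is running.
kill_running_jobs() {
  local pids
  pids=$(jobs -pr)
  if [ -n "$pids" ]; then
    # Intentionally unquoted: one PID per word.
    kill $pids
  fi
}

# Case 1: no background jobs, so kill is never called and no usage text appears.
kill_running_jobs
no_jobs_status=$?

# Case 2: one running job, which receives the default SIGTERM.
sleep 30 &
job_pid=$!
kill_running_jobs
job_status=0
wait "$job_pid" 2>/dev/null || job_status=$?

echo "no jobs: $no_jobs_status, killed job: $job_status"
```

A wait status of 128 plus the signal number (143 for SIGTERM) indicates that the job was ended by the signal rather than exiting on its own.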
[ -n "$(jobs -pr)" ] doesn't work on my bash. I use GNU bash, version 4.2.46(2)-release (x86_64-redhat-linux-gnu). The "kill: usage" message keeps popping up. – Pavlov
jobs -pr doesn't return the PIDs of the children of the background processes. It doesn't tear the entire process tree down, only trims off the roots. – Pavlov
function cleanup_func {
sleep 0.5
echo cleanup
}
trap "exit \$exit_code" INT TERM
trap "exit_code=\$?; cleanup_func; kill 0" EXIT
# exit 1
# exit 0
Like https://mcmap.net/q/108000/-how-do-i-kill-background-processes-jobs-when-my-shell-script-exits, but with added exit-code
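How the exit code survives the EXIT trap can be checked in isolation. The following is a minimal sketch, assuming bash is on PATH; it keeps the same trap layout but leaves out kill 0 (an assumption made so the demo cannot signal its caller), and the status variable is illustrative.

```shell
#!/usr/bin/env bash
# Child shell with the same trap layout as the answer, except that "kill 0"
# is left out so the demo cannot signal the calling shell.
status=0
bash -c '
  cleanup_func() { echo "cleanup" >&2; }
  trap "exit \$exit_code" INT TERM
  trap "exit_code=\$?; cleanup_func" EXIT
  exit 3
' || status=$?

echo "script exited with status $status"
```

The EXIT trap does not call exit itself, so the shell still terminates with the status it was exiting with before the trap ran.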
exit_code come from in INT TERM trap? – Duky
EXIT trap is always invoked when the script exits. It sets the global variable exit_code to the exit code of the last command executed. After running cleanup_func, it then sends SIGTERM "to every process in the process group of the calling process", including itself (see kill(2)). SIGTERM is trapped and exits with $exit_code. Now, if you press ^C during script execution, $exit_code in the INT trap will be empty and exit will be invoked with the exit code of the last command: 130 (see help exit). exit triggers the EXIT trap: start from the top. – Danelledanete
A nice version that works under Linux, BSD and MacOS X. First tries to send SIGTERM, and if it doesn't succeed, kills the process after 10 seconds.
KillJobs() {
for job in $(jobs -p); do
kill -s SIGTERM $job > /dev/null 2>&1 || (sleep 10 && kill -9 $job > /dev/null 2>&1 &)
done
}
TrapQuit() {
# Whatever you need to clean here
KillJobs
}
trap TrapQuit EXIT
Please note that jobs does not include grandchild processes.
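In KillJobs above, the || branch runs only when kill itself reports failure, so a process that receives SIGTERM but ignores it is never escalated. A possible variant that polls before escalating might look like this. It is a sketch, assuming bash; terminate_with_timeout and its two arguments (PID and grace period in seconds) are hypothetical.

```shell
#!/usr/bin/env bash
# terminate_with_timeout is a hypothetical helper: it takes a PID and a
# grace period in seconds.
terminate_with_timeout() {
  local pid=$1 grace=$2 waited=0
  # Ask politely first; if the signal cannot be sent, the process is
  # already gone (or not ours), so there is nothing left to do.
  kill -s TERM "$pid" 2>/dev/null || return 0
  # "kill -0" sends no signal, it only checks whether the PID still exists.
  while kill -0 "$pid" 2>/dev/null && [ "$waited" -lt "$grace" ]; do
    sleep 1
    waited=$((waited + 1))
  done
  # Still there after the grace period: force it.
  if kill -0 "$pid" 2>/dev/null; then
    kill -s KILL "$pid" 2>/dev/null || true
  fi
  return 0
}

sleep 30 &
target=$!
terminate_with_timeout "$target" 3
target_status=0
wait "$target" 2>/dev/null || target_status=$?
echo "exit status of the job: $target_status"
```

Inside a trap handler this could be called once per PID printed by jobs -p; like the original, it only reaches direct jobs and not their children.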
I made an adaptation of @tokland's answer combined with the knowledge from http://veithen.github.io/2014/11/16/sigterm-propagation.html when I noticed that trap doesn't trigger if I'm running a foreground process (not backgrounded with &):
#!/bin/bash
# killable-shell.sh: Kills itself and all children (the whole process group) when killed.
# Adapted from http://stackoverflow.com/a/2173421 and http://veithen.github.io/2014/11/16/sigterm-propagation.html
# Note: Does not work (and cannot work) when the shell itself is killed with SIGKILL, for then the trap is not triggered.
trap "trap - SIGTERM && echo 'Caught SIGTERM, sending SIGTERM to process group' && kill -- -$$" SIGINT SIGTERM EXIT
echo $@
"$@" &
PID=$!
wait $PID
trap - SIGINT SIGTERM EXIT
wait $PID
Example of it working:
$ bash killable-shell.sh sleep 100
sleep 100
^Z
[1] + 31568 suspended bash killable-shell.sh sleep 100
$ ps aux | grep "sleep"
niklas 31568 0.0 0.0 19640 1440 pts/18 T 01:30 0:00 bash killable-shell.sh sleep 100
niklas 31569 0.0 0.0 14404 616 pts/18 T 01:30 0:00 sleep 100
niklas 31605 0.0 0.0 18956 936 pts/18 S+ 01:30 0:00 grep --color=auto sleep
$ bg
[1] + 31568 continued bash killable-shell.sh sleep 100
$ kill 31568
Caught SIGTERM, sending SIGTERM to process group
[1] + 31568 terminated bash killable-shell.sh sleep 100
$ ps aux | grep "sleep"
niklas 31717 0.0 0.0 18956 936 pts/18 S+ 01:31 0:00 grep --color=auto sleep
I have finally found a solution that appears to work in all cases to kill all descendants recursively, regardless of whether they are jobs or sub-processes. The other solutions here all seemed to fail with things such as:
while ! ffmpeg ....
do
sleep 1
done
In my situation, ffmpeg would keep running after the parent script exited.
I found a solution here for recursively getting the PIDs of all child processes and used that in the trap handler thus:
cleanup() {
# kill all processes whose parent is this process
kill $(pidtree $$ | tac)
}
pidtree() (
[ -n "$ZSH_VERSION" ] && setopt shwordsplit
declare -A CHILDS
while read P PP;do
CHILDS[$PP]+=" $P"
done < <(ps -e -o pid= -o ppid=)
walk() {
echo $1
for i in ${CHILDS[$1]};do
walk $i
done
}
for i in "$@";do
walk $i
done
)
trap cleanup EXIT
The above, placed at the start of a bash script, succeeds in killing all child processes. Note that pidtree is called with $$, which is the PID of the bash script that is exiting. The list of PIDs (one per line) is reversed using tac to try to ensure that parent processes are killed only after their children, avoiding possible race conditions in loops such as the example I gave.
None of the answers here worked for me in the case of a continuous integration (CI) script that starts background processes from subshells. For example:
(cd packages/server && npm start &)
The subshell terminates after starting the background process, which therefore ends up with parent PID 1.
With PPID not an option, the only portable (Linux and MacOS) and generic (independent of process name, listening ports, etc.) approach left is the process group (PGID). However, I can't just kill that because it would kill the script process, which would fail the CI job.
# Terminate the given process group, excluding this process. Allows 2 seconds
# for graceful termination before killing remaining processes. This allows
# shutdown errors to be printed, while handling processes that fail to
# terminate quickly.
kill_subprocesses() {
echo "Terminating subprocesses of PGID $1 excluding PID $$"
# Get all PIDs in this process group except this process
# (pgrep on NetBSD/MacOS does this by default, but Linux pgrep does not)
# Uses a here-string instead of piping to avoid including the grep PID
pids=$(grep -Ev "\\<$$\\>" <<<"$(pgrep -g "$1")")
if [ -n "$pids" ]; then
echo "Terminating processes: ${pids//$'\n'/, }"
# shellcheck disable=SC2086
kill $pids || true
fi
sleep 2
# Check for remaining processes and kill them
pids=$(grep -Ev "\\<$$\\>" <<<"$(pgrep -g "$1")")
if [ -n "$pids" ]; then
echo "Killing remaining processes: ${pids//$'\n'/, }"
# shellcheck disable=SC2086
kill -9 $pids || true
fi
}
# Terminate subprocesses on exit or interrupt
# shellcheck disable=SC2064
trap "kill_subprocesses $$" EXIT SIGINT SIGTERM
Another option is to have the script set itself as the process group leader, and trap a killpg on your process group on exit.
EDIT: a possible bash hack to create a new process group is to use setsid(1), but only if we're not already the process group leader (can query it with ps).
Placing this at the beginning of the script can achieve that.
# Create a process group and exec the script as its leader if necessary
[[ "$(ps -o pgid= $$)" -eq "$$" ]] || exec setsid /bin/bash "$0" "$@"
Then signaling the process group with kill -- -$$ would work as expected even when the script is not already the process group leader.
kill -- -$$ and kill 0 answers suggest; starting a new process group is the novel idea here but needs details on how to do this from bash... – Tribute
setsid(1) can do it, and we can test whether we're the leader with ps. So the bash hack would be to add something like this to the beginning of the script: `[[ "$(ps -o pgid= $$)" -eq "$$" ]] || exec setsid /bin/bash "$0" "$@"` – Rodrich
jobs -p does not work in all shells if called in a sub-shell, possibly unless its output is redirected into a file but not a pipe. (I assume it was originally intended for interactive use only.)
What about the following:
trap 'while kill %% 2>/dev/null; do jobs > /dev/null; done' INT TERM EXIT [...]
The call to "jobs" is needed with Debian's dash shell, which fails to update the current job ("%%") if it is missing.
trap 'echo in trap; set -x; trap - TERM EXIT; while kill %% 2>/dev/null; do jobs > /dev/null; done; set +x' INT TERM EXIT; sleep 100 & while true; do printf .; sleep 1; done
If you run it in Bash (5.0.3) and try to terminate, there seems to be an infinite loop. However, if you terminate it again, it works. Even with Dash (0.5.10.2-6) you have to terminate it twice. – Duky
Just for diversity I will post a variation of https://mcmap.net/q/108000/-how-do-i-kill-background-processes-jobs-when-my-shell-script-exits , because that solution leads to the message "Terminated" in my environment:
trap 'test -z "$intrap" && export intrap=1 && kill -- -$$' SIGINT SIGTERM EXIT
Universal solution which also works in sh (jobs there does not output anything to stdout):
trap "pkill -P $$" EXIT INT
-g would do that. – Closer
So script the loading of the script. Run a killall (or whatever is available on your OS) command that executes as soon as the script is finished.
p=$(bash -c 'sleep 2 >/dev/null & echo $!'); sleep 1; ps -f -p "$p" to see that sleep 2 command is still running after bash has exited. – Duky
sleep 2 command is running in background as a separate process; its command ends with &. – Duky