In what order should I send signals to gracefully shut down processes?
C

7

119

In a comment on this answer of another question, the commenter says:

don’t use kill -9 unless absolutely necessary! SIGKILL can’t be trapped so the killed program can’t run any shutdown routines to e.g. erase temporary files. First try HUP (1), then INT (2), then QUIT (3)

I agree in principle about SIGKILL, but the rest is news to me. Given that the default signal sent by kill is SIGTERM, I would expect it is the most-commonly expected signal for graceful shutdown of an arbitrary process. Also, I have seen SIGHUP used for non-terminating reasons, such as telling a daemon "re-read your config file." And it seems to me that SIGINT (the same interrupt you'd typically get with Ctrl-C, right?) isn't as widely supported as it ought to be, or terminates rather ungracefully.

Given that SIGKILL is a last resort — Which signals, and in what order, should you send to an arbitrary process, in order to shut it down as gracefully as possible?

Please substantiate your answers with supporting facts (beyond personal preference or opinion) or references, if you can.

Note: I am particularly interested in best practices that include consideration of bash/Cygwin.

Edit: So far, nobody seems to mention INT or QUIT, and there's limited mention of HUP. Is there any reason to include these in an orderly process-killing?

Catheycathi answered 27/3, 2009 at 16:17 Comment(1)
If you have to resort to SIGKILL to really kill a process, I would consider it a bug in the program.Bondwoman
H
154

SIGTERM tells an application to terminate. The other signals tell the application other things which are unrelated to shutdown but may sometimes have the same result. Don't use those. If you want an application to shut down, tell it to. Don't give it misleading signals.

Some people believe the smart standard way of terminating a process is by sending it a slew of signals, such as HUP, INT, TERM and finally KILL. This is ridiculous. The right signal for termination is SIGTERM and if SIGTERM doesn't terminate the process instantly, as you might prefer, it's because the application has chosen to handle the signal. Which means it has a very good reason to not terminate immediately: It's got cleanup work to do. If you interrupt that cleanup work with other signals, there's no telling what data from memory it hasn't yet saved to disk, what client applications are left hanging or whether you're interrupting it "mid-sentence" which is effectively data corruption.
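To make the point about handlers concrete, here is a minimal sketch of a process that handles SIGTERM because it has cleanup to do. It is an illustration only: the `worker` function and the scratch file are made up for this example, not taken from any real application.

```shell
#!/bin/bash
# Hypothetical example: a worker that owns a scratch file and removes it
# when it is asked to terminate.
scratch=$(mktemp)

worker() {
    # On SIGTERM: clean up, then exit normally.
    trap 'rm -f "$scratch"; exit 0' TERM
    while :; do sleep 1; done
}

worker &
worker_pid=$!
sleep 1                      # give the worker time to install its handler

kill -TERM "$worker_pid"     # ask it to terminate
wait "$worker_pid"           # the handler runs once the current sleep ends
worker_status=$?

echo "worker exited with status $worker_status"
[ -e "$scratch" ] || echo "scratch file was cleaned up"
```

Note that the worker does not die the instant the signal is sent: it finishes what it is doing, runs its handler, and then exits. A SIGKILL sent in that window would leave the scratch file behind.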

For more information on what the real meaning of the signals is, see sigaction(2). Don't confuse "Default Action" with "Description", they are not the same thing.

SIGINT is used to signal an interactive "keyboard interrupt" of the process. Some programs may handle the situation in a special way for the purpose of terminal users.

SIGHUP is used to signal that the terminal has disappeared and is no longer looking at the process. That is all. Some processes choose to shut down in response, generally because their operation makes no sense without a terminal; others choose to do other things, such as rechecking configuration files.

SIGKILL is used to forcefully remove the process from the kernel. It is special in the sense that it's not actually a signal to the process but rather gets interpreted by the kernel directly.

Don't send SIGKILL. SIGKILL should certainly never be sent by scripts. If the application handles SIGTERM, its cleanup can take a second, a minute, or an hour, depending on what the application has to get done before it's ready to end. Any logic that "assumes" an application's cleanup sequence has taken long enough and needs to be shortcut or SIGKILLed after X seconds is just plain wrong.

The only reason why an application would need a SIGKILL to terminate is if something bugged out during its cleanup sequence. In that case you can open a terminal and SIGKILL it manually. Aside from that, the only other reason why you'd SIGKILL something is because you WANT to prevent it from cleaning itself up.

Even though half the world blindly sends SIGKILL after 5 seconds, it's still a horribly wrong thing to do.

Hanseatic answered 27/3, 2009 at 17:7 Comment(5)
You're right that there's a lot of misuse of SIGKILL out there. But there is a time and place to use it, even from a script. Many, many apps trap SIGTERM and exit gracefully in less than a second or within just a few seconds, and if one of those is still running 30 secs later it's because it's wedged.Thermic
@dwc: Try letting it run once for an hour. If it doesn't die then it's "wedged" and either fix it, or be lazy and in the future SIGKILL it after some time. Take note that you're probably corrupting stuff and remember that this is NOT something you should be doing "by default".Hanseatic
@lhunath: Hope you don't mind, I rearranged your paragraphs in order to make the answer more directly and clearly follow from the question. The anti-SIGKILL rant is good stuff, but a secondary point. Thanks again for an excellent & educational answer.Catheycathi
Don't send SIGKILL. Ever. Just plain wrong. Really? Even if your system is already burning thanks to infinite loops. Good luck. -1Smoot
The only time I care to signal a program to stop is when it's not responding, so I'd gladly send SIGKILL before anything else, and I don't care what it does with its data since I'm going to restart it anyway. There's a lot of bugs out there, unfortunately.Simpson
P
29

Short Answer: Send SIGTERM, then SIGKILL 30 seconds later. That is, send SIGTERM and wait a bit. The right delay varies from program to program and you may know your system better, but 5 to 30 seconds is usually enough. (When shutting down a machine, you may see it wait up to 1 min 30 s automatically. Why the hurry, after all?) Then send SIGKILL.

Reasonable Answer: SIGTERM, SIGINT, SIGKILL. This is more than enough. The process will very probably terminate before SIGKILL.

Long Answer: SIGTERM, SIGINT, SIGQUIT, SIGABRT, SIGKILL

This is unnecessary, but at least you are not misleading the process regarding your message. All these signals do mean you want the process to stop what it is doing and exit.

No matter what answer you choose from this explanation, keep that in mind!

If you send a signal that means something else, the process may handle it in very different ways (on one hand). On the other hand, if the process doesn't handle the signal, it doesn't matter what you send after all, the process will quit anyway (when the default action is to terminate, of course).
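A quick way to see the "default action is to terminate" case is to signal a process that installs no handler and look at how it ended. The sketch below uses `sleep` as a stand-in for such a process; the shell reports a child killed by a signal as exit status 128 plus the signal number.

```shell
#!/bin/bash
# sleep installs no SIGTERM handler, so the default action (terminate) applies.
sleep 30 &
sleeper_pid=$!

kill -TERM "$sleeper_pid"
wait "$sleeper_pid"
status=$?

echo "exit status: $status"                  # 128 + 15 (SIGTERM) = 143
echo "killed by:   $(kill -l "$status")"     # kill -l maps the status back to a signal name
```

An exit status above 128 from `wait` is therefore a hint that the process did not handle the signal (or re-raised it) rather than exiting on its own terms.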

So, you must think of yourself as a programmer. Would you code a handler function for, let's say, SIGHUP to quit a program that connects to something, or would you have it loop and try to connect again? That is the main question here! That is why it is important to send only signals that mean what you intend.

Almost Stupid Long Answer:

The table below contains the relevant signals, and the default actions in case the program does not handle them.

I ordered them in the order I suggest using them (BTW, I suggest you use the reasonable answer, not this one), if you really need to try them all (it would be fun to say the table is ordered in terms of the destruction they may cause, but that is not completely true).

The signals with an asterisk (*) are NOT recommended. The important thing about these is that you can never know what the program does with them. Especially SIGUSR! It may start the apocalypse (it is a free signal for a programmer to do whatever he/she wants!). But if it is not handled, OR in the unlikely case that it is handled to terminate, the program will terminate.

In the table, the signals whose default action is to terminate and generate a core dump are placed at the end, just before SIGKILL.

Signal     Value     Action   Comment
----------------------------------------------------------------------
SIGTERM      15       Term    Termination signal
SIGINT        2       Term    Famous CONTROL+C interrupt from keyboard
SIGHUP        1       Term    Disconnected terminal or parent died
SIGPIPE      13       Term    Broken pipe
SIGALRM(*)   14       Term    Timer signal from alarm
SIGUSR2(*)   12       Term    User-defined signal 2
SIGUSR1(*)   10       Term    User-defined signal 1
SIGQUIT       3       Core    CONTROL+\ or quit from keyboard
SIGABRT       6       Core    Abort signal from abort(3)
SIGSEGV      11       Core    Invalid memory reference
SIGILL        4       Core    Illegal Instruction
SIGFPE        8       Core    Floating point exception
SIGKILL       9       Term    Kill signal

Then I would suggest for this almost stupid long answer: SIGTERM, SIGINT, SIGHUP, SIGPIPE, SIGQUIT, SIGABRT, SIGKILL

And finally, the

Definitely Stupid Long Long Answer:

Don't try this at home.

SIGTERM, SIGINT, SIGHUP, SIGPIPE, SIGALRM, SIGUSR2, SIGUSR1, SIGQUIT, SIGABRT, SIGSEGV, SIGILL, SIGFPE and if nothing worked, SIGKILL.

SIGUSR2 should be tried before SIGUSR1 because we are better off if the program doesn't handle the signal. And it is much more likely for it to handle SIGUSR1 if it handles just one of them.

BTW, about KILL: it is not wrong to send SIGKILL to a process, as another answer stated. Think about what happens when you issue a shutdown command: it tries SIGTERM and SIGKILL only. Why do you think that is the case? And why would you need any other signals, if the shutdown command itself uses only these two?


Now, back to the long answer, this is a nice one-liner:

for SIG in 15 2 3 6 9 ; do echo $SIG ; echo kill -$SIG $PID || break ; sleep 30 ; done

It sleeps for 30 seconds between signals. Why else would you need a one-liner? ;)

Also, recommended: try it with only signals 15 2 9 from the reasonable answer.

Safety: remove the second echo when you are ready to go. I call it my dry run for one-liners. Always use it to test.
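One drawback of the one-liner is that it sleeps the full delay even if the process is already gone. A possible variation, sketched below, checks with `kill -0` once a second and stops as soon as the process disappears. The `escalate` function, its arguments and the `stubborn` test process are all made up for this example; it is not the killgracefully script mentioned further down.

```shell
#!/bin/bash
# escalate PID DELAY SIGNAL...
# Sends each signal in turn, waiting up to DELAY seconds after each one.
# Returns 0 as soon as the process is gone, 1 if it outlived every signal.
escalate() {
    esc_pid=$1
    esc_delay=$2
    shift 2
    for esc_sig in "$@"; do
        kill -0 "$esc_pid" 2>/dev/null || return 0      # already gone
        echo "sending SIG$esc_sig to $esc_pid"
        kill -s "$esc_sig" "$esc_pid" 2>/dev/null || return 0
        esc_waited=0
        while [ "$esc_waited" -lt "$esc_delay" ]; do
            sleep 1
            kill -0 "$esc_pid" 2>/dev/null || return 0  # gone: stop escalating
            esc_waited=$((esc_waited + 1))
        done
    done
    return 1
}

# Demo target: a hypothetical process that ignores SIGTERM.
stubborn() { trap '' TERM; while :; do sleep 1; done; }
stubborn &
target=$!
sleep 1                          # let it install its trap

escalate "$target" 2 TERM KILL   # the "short answer" sequence, with a 2 s delay
escalate_result=$?
wait "$target"
target_status=$?                 # 128 + 9: it took SIGKILL to stop it
echo "escalate returned $escalate_result, target status $target_status"
```

For the reasonable answer you would call it as `escalate "$PID" 30 TERM INT KILL`. Keep in mind that `kill -0` only tells you a process with that PID exists, so on a busy system a recycled PID could in principle fool it.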


Script killgracefully

Actually I was so intrigued by this question that I decided to create a small script to do just that. Please, feel free to download (clone) it here:

GitHub link to Killgracefully repository

Peri answered 8/6, 2016 at 2:47 Comment(3)
I've always wondered why the default action for SIGUSRx is to terminate. If a program sends that signal, it's because it wants the target to do something specific. If the target isn't actually programmed to do anything when it receives that signal, then sure, it's not going to be able to do what the process that sent the signal intends, but how does that imply it's desirable for the process to immediately terminate? That's kind of like designing an OS to immediately shut down the computer if it receives data from a USB device it doesn't have a driver for.Pull
Why would you ever send a SIGPIPE if it is not a pipe, as suggested in the "almost stupid long answer"?Pashm
As I'm sure you read it carefully already, it is NOT recommended, but it is there to show a list of possible orders only. But again, not recommended.Peri
T
8

Typically you'd send SIGTERM, the default of kill. It's the default for a reason. Only if a program does not shut down in a reasonable amount of time should you resort to SIGKILL. But note that with SIGKILL the program has no possibility to clean things up and data could be corrupted.

As for SIGHUP, HUP stands for "hang up" and historically meant that the modem disconnected. It's essentially equivalent to SIGTERM. The reason that daemons sometimes use SIGHUP to restart or reload config is that daemons detach from any controlling terminals as a daemon doesn't need those and therefore would never receive SIGHUP, so that signal was considered as "freed up" for general use. Not all daemons use this for reload! The default action for SIGHUP is to terminate and many daemons behave that way! So you can't go blindly sending SIGHUPs to daemons and expecting them to survive.
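As an illustration of the "reload on SIGHUP" convention, here is a small sketch of a daemon-like loop that re-reads its configuration on SIGHUP and exits on SIGTERM. The `daemon` function, the config format and the log file are invented for the example. A program without such a handler would simply be terminated by the same SIGHUP.

```shell
#!/bin/bash
# Hypothetical config and log files for the demonstration.
config_file=$(mktemp)
log_file=$(mktemp)
echo "greeting=hello" > "$config_file"

daemon() {
    load_config() {
        . "$config_file"                      # (re-)read the settings
        echo "loaded: $greeting" >> "$log_file"
    }
    trap load_config HUP                      # SIGHUP: reload, keep running
    trap 'exit 0' TERM                        # SIGTERM: shut down
    load_config
    while :; do sleep 1; done
}

daemon &
daemon_pid=$!
sleep 1

echo "greeting=goodbye" > "$config_file"      # change the configuration
kill -HUP "$daemon_pid"                       # ask the daemon to re-read it
sleep 2

kill -TERM "$daemon_pid"                      # now ask it to exit
wait "$daemon_pid"
daemon_status=$?

cat "$log_file"
rm -f "$config_file" "$log_file.bak"
```

The log shows one load at startup and a second one after the SIGHUP, and the daemon only exits once it receives SIGTERM.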

Edit: SIGINT is probably inappropriate to terminate a process, as it's normally tied to ^C or whatever the terminal setting is to interrupt a program. Many programs capture this for their own purposes, so it's common enough for it not to work. SIGQUIT typically has the default of creating a core dump, and unless you want core files lying around it's not a good candidate, either.

Summary: if you send SIGTERM and the program doesn't die within your timeframe then send it SIGKILL.

Thermic answered 27/3, 2009 at 16:24 Comment(5)
Note that following it up with SIGKILL should only be done in situations where shutting down instantly is a higher priority than preventing data loss/data corruption.Muster
@Thermic I did not understand the following point in your answer. could you please help "The reason that daemons sometimes use SIGHUP to restart or reload config is that daemons detach from any controlling terminals and therefore would never receive SIGTERM, so that signal was considered as "freed up" for general use."Ardyce
@Ardyce Let me try: SIGHUP is the "hang up" signal which tells a process that the terminal got disconnected. Since daemons run in the background, they don't need terminals. That means that a "hang up" signal isn't relevant to daemons. They'll never receive it from a terminal disconnection, since they don't have terminals connected in the first place. And since the signal is defined anyway, though they don't need it for the original purpose, many daemons use it instead for a different purpose, such as re-reading their config files.Catheycathi
Re: "SIGINT is probably inappropriate", I have a process (not mine, closed source, etc) which consistently will not respond to SIGTERM, but responds to SIGINT. I chose to use SIGINT in my script, because it seemed safer than resorting to SIGKILL. Is that wrong?Dodi
@Dodi these signals are merely conventions, if it doesn't respond appropriately to SIGTERM then you should choose a signal that it does respond toSimpson
R
7

SIGTERM actually means sending an application a message: "would you be so kind and commit suicide". It can be trapped and handled by the application to run cleanup and shutdown code.

SIGKILL cannot be trapped by the application. The application gets killed by the OS without any chance for cleanup.
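The difference is easy to observe. In the sketch below, a made-up `stubborn` process ignores SIGTERM outright and survives it, but has no way to do the same for SIGKILL.

```shell
#!/bin/bash
# Hypothetical process that refuses to terminate on SIGTERM.
stubborn() {
    trap '' TERM                 # an empty trap means "ignore this signal"
    while :; do sleep 1; done
}

stubborn &
stubborn_pid=$!
sleep 1                          # let it install the trap

kill -TERM "$stubborn_pid"
sleep 1
if kill -0 "$stubborn_pid" 2>/dev/null; then
    survived_term=yes            # still there: SIGTERM was ignored
else
    survived_term=no
fi

kill -KILL "$stubborn_pid"       # cannot be trapped or ignored
wait "$stubborn_pid"
kill_status=$?                   # 128 + 9 (SIGKILL) = 137

echo "survived SIGTERM: $survived_term, status after SIGKILL: $kill_status"
```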

It's typical to send SIGTERM first, sleep some time, then send SIGKILL.

Reject answered 27/3, 2009 at 16:53 Comment(3)
I suppose polling would be a bit more efficient than sleeping (before the SIGKILL)Jasmine
@OhadSchneider it would, but that would require something more than simple bash command.Reject
Yeah I guess you would need to loop while the process is still alive using something like this: https://mcmap.net/q/55225/-how-to-check-if-a-process-id-pid-exists.Jasmine
T
4
  • SIGTERM is equivalent to "clicking the 'X' " in a window.
  • SIGTERM is what Linux uses first, when it is shutting down.
Transudate answered 27/3, 2009 at 16:20 Comment(1)
"SIGTERM is equivalent to "clicking the 'X' " in a window" No, it is not, because any one application can easily open any number of (document and tool, for example) windows, let alone dialogs, and it may not even respond to a last window close command as it does to an exit command (I can't think of any obvious examples, but while non-obvious, there's no reason why it can't be done that way). SIGTERM is (or should be) equivalent to gracefully asking the application to terminate, however that might be performed in that particular application.Maxinemaxiskirt
J
4

With all the discussion going on here, no code has been offered. Here's my take:

#!/bin/bash

# Example PID; replace it with the PID of the process you want to stop.
pid=1234

echo "Killing process $pid..."
kill "$pid"    # kill sends SIGTERM by default

waitAttempts=30
for i in $(seq 1 "$waitAttempts")
do
    echo "Checking if process is alive (attempt #$i / $waitAttempts)..."
    sleep 1

    if ps -p "$pid" > /dev/null
    then
        echo "Process $pid is still running"
    else
        echo "Process $pid has shut down successfully"
        break
    fi
done

# Only escalate if the process outlived every check above.
if ps -p "$pid" > /dev/null
then
    echo "Could not shut down process $pid gracefully - killing it forcibly..."
    kill -KILL "$pid"
fi
Jasmine answered 26/1, 2017 at 19:36 Comment(0)
C
0

HUP sounds like rubbish to me. I'd send it to get a daemon to re-read its configuration.

SIGTERM can be intercepted; your daemon just might have clean-up code to run when it receives that signal. You cannot do that for SIGKILL. Thus with SIGKILL you are not giving the daemon's author any options.

More on that on Wikipedia

Cacodyl answered 27/3, 2009 at 16:26 Comment(0)
