C++ graceful shutdown best practices

3

11

I'm writing a multi-threaded C++ application for *nix operating systems. What are some best practices for terminating such an application gracefully? My instinct is to install a signal handler for SIGINT (and SIGTERM?) that stops and joins my threads. Also, is it possible to "guarantee" that all destructors are called (provided no other errors or exceptions are thrown while handling the signal)?

Olympias answered 28/12, 2013 at 22:57 Comment(2)
Good question... I'm curious about the answer myself. - Mathematics
I think it's not a bad idea to use a global shutdown flag that you check in the main event loop, or in whatever you normally use for synchronization. To guarantee that all destructors are called, you need to unwind each thread's call stack, assuming that you use RAII for heap-allocated objects. There's really no silver bullet here; the crudest solution would be to throw an exception and then catch it in the thread's main function. - Whitherward
4

Some considerations come to mind:

  • Designate one thread to be responsible for orchestrating the shutdown. As Dithermaster suggested, this could be the main thread if you are writing a standalone application. If you are writing a library, provide an interface (e.g. a function call) through which a client program can terminate the objects created within the library.

  • The language will not guarantee that every destructor is called; that is up to you. For heap-allocated objects it requires carefully calling delete for each new, and smart pointers may help you there. But, really, this is a design consideration. The major components should have start and stop semantics, which you could choose to invoke from the class constructor and destructor.

  • The shutdown sequence for a set of interacting objects can require some effort to get right. E.g., before you delete an object, are you sure some timer mechanism is not going to try calling it a few microseconds or milliseconds later? Trial and error is your friend here: develop a framework that can repeatedly and rapidly start and stop your application to tease out shutdown-related race conditions.

  • Signals are one way to trigger an event; others might be periodically polling for a known file, or opening a socket and receiving some data on it. Either way, you want to decouple the shutdown sequence code from the trigger event.
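The start/stop point above can be sketched as a small component whose constructor starts a thread and whose destructor stops and joins it. This is only an illustration: the class name `Ticker` and its members are hypothetical, and a polled `std::atomic<bool>` is one assumption among several possible stop mechanisms. Note that `stop()` knows nothing about what triggered it (signal, file, socket), which keeps the shutdown sequence decoupled from the trigger.

```cpp
#include <atomic>
#include <chrono>
#include <thread>

// Hypothetical component with start/stop semantics tied to its lifetime:
// the constructor starts the thread, the destructor stops and joins it.
class Ticker {
public:
    Ticker() : thread_([this] { run(); }) {}
    ~Ticker() { stop(); }

    Ticker(const Ticker&) = delete;
    Ticker& operator=(const Ticker&) = delete;

    // Idempotent. Call it from the owning thread only, and never from a
    // signal handler: joining is not async-signal-safe.
    void stop() {
        stop_requested_.store(true);
        if (thread_.joinable()) thread_.join();
    }

    int ticks() const { return ticks_.load(); }

private:
    void run() {
        while (!stop_requested_.load()) {
            ticks_.fetch_add(1);
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
    }

    std::atomic<bool> stop_requested_{false};
    std::atomic<int> ticks_{0};
    std::thread thread_;  // declared last, so the flags exist before the thread starts
};
```

Because the destructor joins before any member is destroyed, the thread cannot call into a half-destroyed object, which is the kind of timer race the third bullet warns about. Creating and destroying such objects in a tight loop is a cheap way to start building the stress framework mentioned there.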

Rexanne answered 28/12, 2013 at 23:30 Comment(0)
2

My recommendation is that the main thread shut down all worker threads before exiting itself. Send each worker an event telling it to clean up and exit, and wait for each one to do so. This will allow all C++ destructors to run.
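A minimal sketch of that sequence, assuming the "event" is a flag protected by a mutex and a condition variable (the names `StopEvent`, `Resource` and `run_workers` are made up for illustration). Each worker owns a stack object, and the counter shows that its destructor has run by the time the main thread finishes joining.

```cpp
#include <atomic>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical "please clean up and exit" event shared by all workers.
struct StopEvent {
    std::mutex m;
    std::condition_variable cv;
    bool stop = false;

    void signal() {
        {
            std::lock_guard<std::mutex> lock(m);
            stop = true;
        }
        cv.notify_all();
    }

    void wait() {
        std::unique_lock<std::mutex> lock(m);
        cv.wait(lock, [this] { return stop; });
    }
};

// Stand-in for a resource owned by a worker; counts how many were released.
struct Resource {
    explicit Resource(std::atomic<int>& counter) : released(counter) {}
    ~Resource() { released.fetch_add(1); }
    std::atomic<int>& released;
};

// Starts n workers, tells them to exit, waits for each one, and returns
// the number of Resource destructors that ran.
int run_workers(int n) {
    StopEvent ev;
    std::atomic<int> released{0};
    std::vector<std::thread> workers;
    for (int i = 0; i < n; ++i) {
        workers.emplace_back([&ev, &released] {
            Resource r(released);  // destroyed when the thread function returns
            ev.wait();             // real code would do work between checks
        });
    }
    ev.signal();                        // send the event to every worker
    for (auto& w : workers) w.join();   // wait for each one to finish
    return released.load();
}
```

Called from `main` as `run_workers(4)`, this returns 4: every worker left its function normally, so every `Resource` was destroyed before the corresponding `join()` returned.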

Moth answered 28/12, 2013 at 23:11 Comment(0)
1

Regarding signal management, the only thing you can portably and safely do inside a signal handler is to write to a variable of type volatile sig_atomic_t and return. In general, you cannot call most functions (only the async-signal-safe ones) and must not write to other global memory. In other words, the handler should just set a flag that your main routine tests at some point you find appropriate, and the action resulting from the signal should be performed from there.
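As a rough sketch of that pattern on a POSIX system, the handler below only sets a flag, and ordinary code reads it later. The names `g_stop_requested`, `on_stop_signal`, `install_stop_handlers` and `stop_requested` are invented for this example, and `sigaction()` is used on the assumption that the target is *nix.

```cpp
#include <signal.h>

namespace {
// The only object the handler touches.
volatile sig_atomic_t g_stop_requested = 0;
}

// Signal handler: set the flag and return, nothing else.
extern "C" void on_stop_signal(int) {
    g_stop_requested = 1;
}

// Installs the handler for SIGINT and SIGTERM; returns false on failure.
bool install_stop_handlers() {
    struct sigaction sa{};
    sa.sa_handler = on_stop_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;  // no SA_RESTART, so blocking calls can fail with EINTR
    return sigaction(SIGINT, &sa, nullptr) == 0 &&
           sigaction(SIGTERM, &sa, nullptr) == 0;
}

// Tested from the main loop, outside of signal context.
bool stop_requested() {
    return g_stop_requested != 0;
}
```

A main loop would then look like `while (!stop_requested()) { /* work */ }`, followed by the actual shutdown sequence. A `volatile sig_atomic_t` is not a thread synchronization primitive, so in a multi-threaded program the thread that notices the flag should pass the request on to the other threads through a proper mechanism such as an atomic or a condition variable.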

(Since there might be blocking I/O involved, consider studying POSIX Thread Cancellation. Your Unix clone (most notably Linux) might have peculiarities with respect to this and to the above.)

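Regarding destructors, no magic is involved. They will be executed if control leaves a given scope through any means defined in the language, such as reaching the end of the block, return, or an exception that is caught further up. Leaving a scope through other means (for example, longjmp() or even exit()) does not trigger the destructors of local objects; exit() runs only the destructors of objects with static storage duration and the functions registered with atexit().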
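A small illustration of the rule, using a hypothetical `Tracer` type that records its name when it is destroyed. Both functions leave their scope through means the language defines, so both destructors run. Replacing the `throw` with a call to `std::exit()` would end the process without destroying `t`.

```cpp
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Appends its name to a log when destroyed.
struct Tracer {
    Tracer(std::vector<std::string>& log, std::string name)
        : log_(log), name_(std::move(name)) {}
    ~Tracer() { log_.push_back(name_); }

    std::vector<std::string>& log_;
    std::string name_;
};

// Leaves the scope by returning normally.
void leave_by_return(std::vector<std::string>& log) {
    Tracer t(log, "return");
}

// Leaves the scope by throwing; t is destroyed during stack unwinding.
void leave_by_exception(std::vector<std::string>& log) {
    Tracer t(log, "exception");
    throw std::runtime_error("shutdown requested");
}

// Runs both cases and returns the order in which the destructors ran.
std::vector<std::string> destructor_demo() {
    std::vector<std::string> log;
    leave_by_return(log);
    try {
        leave_by_exception(log);
    } catch (const std::runtime_error&) {
        // Catching matters: if no handler is found, whether the stack is
        // unwound before std::terminate() is implementation-defined.
    }
    return log;
}
```

This is also why throwing a "shutdown" exception only helps if each thread's top-level function catches it.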

Regarding general shutdown practices, opinions in the field diverge.

Some state that a "graceful termination", in the sense of releasing every resource ever allocated, should be performed. In C++, this usually means that all destructors should be properly executed before the process terminates. This is tricky in practice and often a source of much grief, especially in multithreaded programs, for a variety of reasons. Signals further complicate things by the very nature of asynchronous signal dispatching.

Because most of this work is useless, others (myself included) contend that the program should simply terminate immediately, possibly right after undoing persistent changes to the system (such as removing temporary files or restoring the screen resolution) and saving configuration. An apparently tidier cleanup is usually a waste of time, because the operating system reclaims most things on its own: allocated memory, dangling threads, and open file descriptors. It can even be a costly one: deallocators may touch paged-out memory, forcing the system to page it back in only to release it moments before the process terminates. On top of that, joining threads at exit opens the door to deadlocks.

Just say no. When you want to leave, call exit() (or even _exit(), but watch out for unflushed I/O) and that's it. Slow-terminating programs are even more annoying than slow-starting ones.
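One way this could look in code, assuming the only persistent state to undo is a set of temporary files. The registry `temp_files()` and the functions `undo_persistent_changes()` and `quit()` are hypothetical names for this sketch, and `_exit()` is the POSIX call from `<unistd.h>`.

```cpp
#include <cstdio>
#include <fstream>
#include <string>
#include <vector>
#include <unistd.h>

// Hypothetical registry of files that must not outlive the process.
std::vector<std::string>& temp_files() {
    static std::vector<std::string> files;
    return files;
}

// Undoes the persistent changes the OS will not undo for us.
void undo_persistent_changes() {
    for (const std::string& path : temp_files()) {
        std::remove(path.c_str());
    }
    temp_files().clear();
}

// Fast exit: no destructors, no atexit() handlers, no thread joins.
[[noreturn]] void quit(int status) {
    undo_persistent_changes();
    std::fflush(nullptr);  // flush all C stdio output streams
    _exit(status);
}
```

`std::fflush(nullptr)` covers C stdio only; C++ streams with their own buffers, such as an open `std::ofstream`, would have to be flushed or closed explicitly before calling `quit()`. Using `exit()` instead of `_exit()` would additionally run `atexit()` handlers and static destructors.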

Wertz answered 29/12, 2013 at 1:24 Comment(2)
Isn't it also safe to raise signals to other threads? longjmp is also allowed inside signal handlers, and probably throw as well. - Lush
@BenVoigt: Sure, that's why I said "in general". Any signal-safe function can be called, but as per POSIX.1 (in SUSv2 the same is stated under the documentation entry for signal()), the behavior is undefined if the handler refers to static memory other than by writing to a sig_atomic_t (or errno in later revisions). This makes it impossible to call most functions in a useful manner, including pthread_kill() and longjmp(), for obvious reasons. Finally, I'm not aware of the semantics of throw inside a signal handler with respect to multithreaded POSIX applications. - Wertz
