How to use exit() safely from any thread
Asked Answered
H

3

8

According to the man page (2) the exit function is not thread safe : MT-Unsafe race:exit, this is because this function tries to clean up resources (flush data to the disk, close file descriptors, etc...) by calling callbacks registered using on_exit and atexit. And I want my program to do that ! (one of my thread keeps a fd open during the whole program's lifespan so _exit is not an option for me because I want all the data to be written to the output file)

My question is the following : if I'm being careful and I don't share any sensible data (like a fd) between my threads, is it "acceptable" to call exit in a multi-threaded program ? Note that I'm only calling exit if an unrecoverable error occurs. Yet, I can't afford having a segfault while the program tries to exit. The thing is, an unrecoverable error can happen from any thread...

I was thinking about using setjmp/longjmp to kill my threads "nicely" but this would be quite complex to do and would require many changes everywhere in my code.

Any suggestions would be greatly appreciated. Thanks ! :)

EDIT : Thanks to @Ctx enlightenment, I came up with the following idea :

#define EXIT(status) do { pthread_mutex_lock(&exit_mutex); exit(status); } while(0)

Of course the exit_mutex must be global (extern).

Hyperkeratosis answered 23/7, 2019 at 10:4 Comment(16)
If the program needs to be able to cleanly shut down in any case design it like this. Reading your question I have the impression you did not thought about that until just now facing what you call an "Unrecoverable Error", which, to be honest, I doubt there were. And if it were, you probably had other problems then bringing down your program cleanly ...;)Gibbous
one of my thread keeps a fd open during the whole program's lifespan so _exit is not an option for me because I want all the data to be written to the output file Integer-type file descriptors do not buffer data in the process's address space. Any write() (or similar) call that's completed when you call _exit() will be in the file it was written to (barring things like power failures that crash the system before the page cache is flushed to disk...) exit() only causes problems with FILE * based streams that buffer data.Quirinus
@Gibbous I have many legit unrecoverable errors that can happen. However I've just realized now that exit() is not thread-safe. @AndrewHenle Good to know ! Thanks ! But I still think _exit is kind of gross anyway... Better clean what's cleanable :)Hyperkeratosis
@Hyperkeratosis Better clean what's cleanable Even better is to not make any messes that need cleaning in the first place. Ideally, any process should be able to take a SIGKILL or a power failure at any moment and be perfectly happy when it's restarted. You're spending how much time trying to make your cleanup robust? And even if you do that, you don't protect against a power failure or incorrectly aimed SIGKILL.Quirinus
@AndrewHenle Of course I can't predict everything but I try to make it as robust as possible... TBH, my main concern is to exit my program without receiving any SIGSEGV (that's why I'm making sure exit will not mess with my thread safety), now you are probably tempted to mention _exit again, but the thing is I don't know every underlying Linux concepts and calling exit safely instead of the brutal _exit is a way for me to know I'm not making anything stupid and that resources will be cleaned up. Don't know if I'm making sense rn... :sweat_smile:Hyperkeratosis
Oh and btw ! You said exit() only causes problems with FILE * based streams that buffer data did you mean _exit ? Because I'm using FILE* everywhere :xHyperkeratosis
exit() will flush all your open FILE * streams. That can be a problem if, for example, FILE * structures have internal locks of any kind and one thread is holding that lock for some reason. Then the call to exit() will hang. (This is one reason why exit() is not async-signal-safe and can't be safely called from within a signal handler.) Can any of your uses of a FILE * hold such a lock indefinitely? Perhaps by blocking for input on a read/write FILE * stream?Quirinus
Well... If I'm not mistaken, fread/fwrite calls are blocking (there's no O_NONBLOCK flag when using fopen) but I don't see why it would prevent an exit() from finishing... But tbh I don't really know how the FILE struct works internally, I might be wrong here.Hyperkeratosis
Can you give one good example of what you mean by an "unrecoverable error"? I'm having a very hard time imagining what your actual use case is.Expeditious
@Hyperkeratosis Since fread must return data (fill the buffer) until EOF is reached, of course it will block until data is available. There is no way for this interface to even express "interrupted, please retry now" not "nothing for now, retry a bit later". Nor is there any way for an stdio write function to do that, but as long as the user buffer is copied elsewhere, and reusable by the user, the function doesn't need to block. Reading!=writing in this regard. Completed stdio write does not mean the data is on safe/stable storage.Disfavor
@AndrewHenle What if exit() destroys the stdio data structures?Disfavor
"close file descriptors" You don't need to; it's implicit in the destruction of a process (std Unix semantics).Disfavor
@DavidSchwartz : I'm writting encrypted stuff on the disk, if an openssl error occurs, what do you want me to do ? Write it unencrypted ? No way. Don't worry, my unrecoverable errors are real ;) @Disfavor So... If Linux guarantees that, would calling _exit be ok ?Hyperkeratosis
@Hyperkeratosis I still don't understand. You say, "an openssl error". What does that mean exactly? What SSL error exactly? Obviously, if you encounter an error trying to encrypt data, you don't write unencrypted data to disk. That would seem a good reason to do a clean shutdown of your program, not calling exit.Expeditious
OpenSSL error when trying to encrypt data using AES or trying to encrypt the AES key using RSA. That's not the only unrecoverable error I've got, but that's probably the most obvious one. Well... Actually exit is pretty clean... _exit isn't however. The only problem with exit is its thread safety.Hyperkeratosis
Related: https://mcmap.net/q/1470832/-better-replacement-for-exit-atexit-in-c/694576Gibbous
S
4

The manpage states that

The exit() function uses a global variable that is not protected, so it is not thread-safe.

so it won't help, if you are being careful in any way.

But the problem documented is a race condition: MT-Unsafe race:exit

So if you make sure, that exit() can never be called concurrently from two threads, you should be on the safe side! You can make this sure by using a mutex for example.

Surrounding answered 23/7, 2019 at 10:14 Comment(8)
Thanks for you answer ! I'm afraid making sure exit() is never called concurrently can be quite tedious... Unless... What do you think about that ? #define EXIT(status) do { pthread_mutex_lock(&exit_mutex); exit(status); } while(0)Hyperkeratosis
@Hyperkeratosis Yes, of course, just use a mutex. I added that to the answer.Surrounding
That looks weird however. Why is exit not doing that natively ? There must be a reason... The program will stop anyway if you're calling exit, so why not making it thread safe ? I don't get it...Hyperkeratosis
Maybe the reason is, that the standard programming pattern is: start process -> start n threads -> join n threads -> call exit (in the main thread). No need for locks here.Surrounding
I've gave it some thoughts, and I think the answer to that is just that not all programs are multi-threaded, so it wouldn't make sense to lock a mutex natively in the exit functionHyperkeratosis
@Hyperkeratosis instead of #define EXIT(status) do {... you can transform all exits into myexit and implement what ever needs to be done in myexit.Wiegand
Yeah well isn't it kind of the same idea ? Just that instead of wrapping the function at runtime, I do it during the preprocessingHyperkeratosis
It can't be made safe and reliable. You have to stop all your threads.Disfavor
C
1

A modern cross-platform C++ solution could be:

#include <cstdlib>
#include <mutex>

std::mutex exit_mutex;

[[noreturn]] void exit_thread_safe(const int status)
{
    exit_mutex.lock();
    exit(status);
}

The mutex ensures that exit is never called by 2 (or more) different threads.

However, I still question the reason behind even caring about this. How likely is a multi-threaded call to exit() and which bad things can even realistically happen?

EDIT:
Using std::quick_exit avoids the clang diagnostic warning.

Ciro answered 6/7, 2021 at 14:42 Comment(2)
bad thing - clang-tidy diagnostic will appearIndelible
@vatosarmat: Yes, I edited my answer to address this issue more cleanly.Ciro
D
0

It can't be done: even if no data is shared between threads at first, data must be shared between a thread and its cleanup function. The function should run only after the thread has stopped or reached a safe point.

Disfavor answered 23/7, 2019 at 11:53 Comment(15)
Ok but what if you mutex lock the call to exit ? There's not concurrent call to exit, and therefore cleanup thread functions will be sequentially called and the thread safety isn't threatened. At least that's what I think and that's also what @Surrounding seems to think... Can you please elaborate ?Hyperkeratosis
The thread safety of what? Why specifically do you need to call exit and not _exit?Disfavor
I've got a FILE* open during the whole program's lifespan. If I use _exit, buffered data will not be written to the disk. And there are probably other reasons I don't even notice (I don't know by heart the implem of every function in the stdio). But calling exit instead of _exit seems safer, and in my context, I really need safetyHyperkeratosis
The point is that you are doing cleanup in one thread while some other is using that resourceDisfavor
Ok I see what you mean... You're right indeed, but that should be safe as long as I don't share anything between my threads, right ? (I don't have any custom cleanup functions, I'm just assuming the stdio has its own)Hyperkeratosis
You seem to be missing the point: 1) there is sharing between the thread using an stdio global object and the one calling exit; 2) clean exit means running destructors of global objects. You can't do that w/o stopping all threads that can use these object objects.Disfavor
What kind of stdio global object are you talking about ? I don't see any... What do you suggest then ? What about instead of calling exit() in any of my thread, I send a SIGUSR1 to the main thread, kill all the threads, and then call exit ?Hyperkeratosis
Can you show some real world code? You have to use a global somewhere.Disfavor
I use some global variables yeah, but only mutexes and pthread conditions basically... At first sight, nothing that need clean up when exit() is called. And no I can't show any code, sorry :/ Thanks for helping me figuring it out, really appreciated !Hyperkeratosis
You can't have just a mutex and CV, ever. You need some normal variable with state!Disfavor
I still don't understand where you're going with this, there are no destructors in C so no code is executed during exit() unless you explicitly registered a clean up callback using atexit(). The only global variables I have are as I said mutexes, CVs and atomic booleans, no funky stuff that would need clean up when exit() is called. So I don't see how this could be unsafe as long as I don't make concurrent calls to exit() which I prevent with a mutex (in the macro you can see in the edit of my postHyperkeratosis
The point of using exit is that you need some atexit stuff, correct? Some global state is needed to use atexit to do any cleanup. Just forget exit, its MT safety, and also forget atexit. Just simplify the code with a cleanup function. How can cleanup be MT safe? If other threads are still using the resources, it probably can't.Disfavor
No I don't have any atexit(). The point of using exit is to be able to exit my program from any thread. I don't have any clean up functions (and I don't need any), that's what I'm trying to tell you from the beginning ^^ Some functions however register cleanup callbacks for me however (fopen does that I think)Hyperkeratosis
@Hyperkeratosis Please explain how "I call atexit in user code" and "I have library code that might use atexit" is different in any significant way. Except in the later case, you can't even restrict what kind of stuff is done, and you can't analyse to know whether it's MT safe.Disfavor
@Hyperkeratosis "Some functions however register cleanup callbacks for me however (fopen does that I think)" Do you imagine that fopen would not use some kind of global variable? Do you think it doesn't need to be protected by a mutex?Disfavor

© 2022 - 2024 — McMap. All rights reserved.