Interruptible thread join in Python

Is there any way to wait for termination of a thread, but still intercept signals?

Consider the following C program:

#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>
#include <pthread.h>
#include <stdlib.h>

void* server_thread(void* dummy) {
    sleep(10);
    printf("Served\n");
    return NULL;
}

void* kill_thread(void* dummy) {
    sleep(1); // Let the main thread join
    printf("Killing\n");
    kill(getpid(), SIGUSR1);
    return NULL;
}

void handler(int signum) {
    printf("Handling %d\n", signum);
    exit(42);
}

int main() {
    pthread_t servth;
    pthread_t killth;

    signal(SIGUSR1, handler);

    pthread_create(&servth, NULL, server_thread, NULL);
    pthread_create(&killth, NULL, kill_thread, NULL);

    pthread_join(servth, NULL);

    printf("Main thread finished\n");
    return 0;
}

It ends after one second and prints:

Killing
Handling 10

In contrast, here's my attempt to write it in Python:

#!/usr/bin/env python
import signal, time, threading, os, sys

def handler(signum, frame):
    print("Handling " + str(signum) + ", frame:" + str(frame))
    sys.exit(42)
signal.signal(signal.SIGUSR1, handler)

def server_thread():
    time.sleep(10)
    print("Served")
servth = threading.Thread(target=server_thread)
servth.start()

def kill_thread():
    time.sleep(1) # Let the main thread join
    print("Killing")
    os.kill(os.getpid(), signal.SIGUSR1)
killth = threading.Thread(target=kill_thread)
killth.start()

servth.join()

print("Main thread finished")

It prints:

Killing
Served
Handling 10, frame:<frame object at 0x12649c0>

How do I make it behave like the C version?

Daggerboard answered 10/3, 2009 at 17:28 Comment(2)
gcc -pthread thread.c is the way to compile the C source, in case anyone else gets errors from running gcc thread.c alone. - Kellerman
Related questions: python - threading ignores KeyboardInterrupt exception - Stack Overflow; linux - Cannot kill Python script with Ctrl-C - Stack Overflow. - Baudoin
D
6

As Jarret Hardie already mentioned, according to Guido van Rossum there is no better way as of now. As stated in the documentation, join(None) blocks, which means no signal handler runs until it returns. The alternative, calling join with a huge timeout (join(2**31) or so) and checking isAlive, looks great at first. However, the way Python implements timed waits is disastrous, as this system call trace shows when the Python test program is run with servth.join(100) instead of servth.join():

select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 2000})  = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 4000})  = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 8000})  = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 16000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 32000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 50000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 50000}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {0, 50000}) = 0 (Timeout)
--- Skipped 15 equal lines ---
select(0, NULL, NULL, NULL, {0, 50000}Killing

That is, Python wakes up every 50 ms, so a single application keeps the CPU from entering a sleep state.
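The sleep intervals in the trace double from 1 ms up to a 50 ms cap, which looks like a capped exponential backoff inside the timed wait. Here is a minimal sketch that reproduces the sequence, assuming a starting delay of 0.5 ms that is doubled before each sleep. Both constants and the function name backoff_delays_us are inferred from the trace for illustration, not taken from the Python source.

```python
def backoff_delays_us(count, start=0.0005, cap=0.05):
    """Return the first `count` sleep intervals in microseconds.

    `start` and `cap` are assumptions chosen to match the trace above.
    """
    delays = []
    delay = start
    for _ in range(count):
        # Double the delay before each sleep, but never exceed the cap.
        delay = min(delay * 2, cap)
        delays.append(int(round(delay * 1e6)))
    return delays


print(backoff_delays_us(9))
```

The first nine values match the select timeouts in the trace: once the cap is reached, every later wakeup is 50 ms apart regardless of how large the join timeout is.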

Daggerboard answered 11/3, 2009 at 12:4 Comment(3)
In Python 3, servth.join() blocked on lock.acquire() can be interrupted by a signal. - Reprehensible
I am posting to concur with @Daggerboard (Python 2.6 on CentOS 6): I expected to receive signals while my main thread did if thread.is_alive(): thread.join(), but the signal was not delivered. Strangely, with while thread.is_alive(): thread.join(30.0) I do receive the signal as expected. In summary, I found the same behavior as @Daggerboard: you have to call thread.join with a timeout, otherwise you can't receive signals. - Hussein
Here's another answer that concurs that you have to use a thread join with a timeout. - Hussein
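For comparison with the first comment above, here is a sketch of the question's script restructured so that the order of events can be inspected on Python 3. It assumes a POSIX system (for SIGUSR1) and that it is run from the main thread. run_demo and its parameters are names made up for this illustration. If a plain join() is interruptible as that comment reports, "handled" should be recorded before "served".

```python
import os
import signal
import threading
import time


def run_demo(serve_seconds=0.5, kill_after=0.1):
    """Record the order in which the signal handler and the server thread run."""
    events = []

    def handler(signum, frame):
        events.append("handled")

    def server_thread():
        time.sleep(serve_seconds)
        events.append("served")

    def kill_thread():
        time.sleep(kill_after)  # let the main thread reach join() first
        os.kill(os.getpid(), signal.SIGUSR1)

    # Handlers can only be installed from the main thread.
    previous = signal.signal(signal.SIGUSR1, handler)
    try:
        servth = threading.Thread(target=server_thread)
        killth = threading.Thread(target=kill_thread)
        servth.start()
        killth.start()
        servth.join()  # plain join, no timeout
        killth.join()
    finally:
        if previous is not None:
            signal.signal(signal.SIGUSR1, previous)  # restore the old handler
    return events


events = run_demo()
print(events)
```

Unlike the question's handler, this one only records the event instead of exiting, so the script always runs to completion and the recorded order shows whether the handler ran during the join or after it.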
P
15

Threads in Python are somewhat strange beasts given the global interpreter lock. You may not be able to achieve what you want without resorting to a join timeout and isAlive as eliben suggests.

There are two spots in the docs that give the reason for this (and possibly more).

The first:

From http://docs.python.org/library/signal.html#module-signal:

Some care must be taken if both signals and threads are used in the same program. The fundamental thing to remember in using signals and threads simultaneously is: always perform signal() operations in the main thread of execution. Any thread can perform an alarm(), getsignal(), pause(), setitimer() or getitimer(); only the main thread can set a new signal handler, and the main thread will be the only one to receive signals (this is enforced by the Python signal module, even if the underlying thread implementation supports sending signals to individual threads). This means that signals can’t be used as a means of inter-thread communication. Use locks instead.
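The main-thread restriction in the quoted passage is easy to observe. A minimal sketch follows; the names try_install and result are made up for illustration. It tries to install a handler from a worker thread and records the exception that signal.signal raises.

```python
import signal
import threading

result = {}


def try_install():
    # signal.signal() refuses to run outside the main thread.
    try:
        signal.signal(signal.SIGTERM, signal.SIG_DFL)
        result["installed"] = True
    except ValueError as exc:
        result["error"] = str(exc)


worker = threading.Thread(target=try_install)
worker.start()
worker.join()
print(result)
```

The worker ends up in the except branch, which is why the handler in the question has to be registered before any threads are started, or at least from the main thread.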

The second, from http://docs.python.org/library/thread.html#module-thread:

Threads interact strangely with interrupts: the KeyboardInterrupt exception will be received by an arbitrary thread. (When the signal module is available, interrupts always go to the main thread.)

EDIT: There was a decent discussion of the mechanics of this on the Python bug tracker: http://bugs.python.org/issue1167930. Of course, it ends with Guido saying: "That's unlikely to go away, so you'll just have to live with this. As you've discovered, specifying a timeout solves the issue (sort of)." YMMV :-)

Pendulum answered 10/3, 2009 at 18:12 Comment(2)
Well, I am calling signal.signal in the main thread (1), and the signal module is available (2). - Daggerboard
Right, but the signal will only go to the main thread, so you have to wait for servth to join before the signal will go to the signal handler (via the main thread). Confusing, no? - Pendulum
C
3

Poll on isAlive before calling join. This polling can be interrupted, of course, and once isAlive returns false, join returns immediately.

An alternative would be polling on join with a timeout, checking with isAlive whether the timeout occurred. This can use less CPU than the previous method.
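The second approach can be sketched as a small helper. The name interruptible_join and the poll_interval default are made up for this illustration, and is_alive() is the current spelling of isAlive. Each timed join returns control to the interpreter, which gives pending signal handlers a chance to run in the main thread.

```python
import threading
import time


def interruptible_join(thread, poll_interval=0.5):
    """Join `thread` in slices instead of one indefinite blocking call."""
    while thread.is_alive():
        # join() returns after poll_interval seconds even if the thread
        # is still running, so the loop re-checks is_alive() each time.
        thread.join(poll_interval)


worker = threading.Thread(target=time.sleep, args=(0.2,))
worker.start()
interruptible_join(worker, poll_interval=0.05)
print(worker.is_alive())
```

A longer poll_interval means fewer wakeups, but as the trace in the other answer suggests, the interpreter may still wake up more often than the timeout implies.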

Carn answered 10/3, 2009 at 17:43 Comment(4)
Sure, polling works, but it uses more resources and prevents the CPU sleep state, which can be quite costly on notebooks. I'm looking for another solution. - Daggerboard
Yes, but with the second method you don't waste CPU, because join with a timeout blocks and releases the CPU. Even a relatively small timeout of a few dozen milliseconds will leave you 99.9% of the CPU free. - Carn
eliben: 99.9% CPU free is not in the least desirable if the work is spread evenly; I'd very much prefer 80% CPU free in a single burst for a desktop application. See lesswatts.org/projects/applications-power-management/… for details. - Daggerboard
lesswatts[.]org is a browser extension pusher now; see an archived copy instead. - Banquet
H
1

As far as I understand, a similar question is solved in The Little Book of Semaphores (free download), appendix A part 3…

Hammett answered 21/2, 2011 at 8:47 Comment(1)
That's quite a hack though, essentially replacing intra- with inter-process communication. - Daggerboard
L
0

I know I'm a bit late to the party, but I came to this question hoping for a better answer than joining with a timeout, which I was already doing. In the end I cooked something up that may or may not be a horrible bastardisation of signals, but it involves using signal.pause() instead of Thread.join() and signalling the current process when the thread reaches the end of its execution:

import signal, os, time, sys, threading, random

threadcount = 200

threadlock = threading.Lock()
pid = os.getpid()
sigchld_count = 0

def handle_sigterm(signalnum, frame):
    print("SIGTERM")

def handle_sigchld(signalnum, frame):
    # Only counts how many times the handler was invoked.
    global sigchld_count
    sigchld_count += 1

def faux_join():
    # Called by each worker as its last step: decrement the live-thread
    # count, then signal the process to wake the main thread from pause().
    global threadcount
    with threadlock:
        threadcount -= 1
    os.kill(pid, signal.SIGCHLD)

def thread_doer():
    time.sleep(2+(2*random.random()))
    faux_join()

if __name__ == '__main__':
    # Handlers must be installed from the main thread.
    signal.signal(signal.SIGCHLD, handle_sigchld)
    signal.signal(signal.SIGTERM, handle_sigterm)

    print(pid)
    for i in range(threadcount):
        t = threading.Thread(target=thread_doer)
        t.start()

    # Caveat: a signal that arrives between the check and pause() is not
    # seen by pause(), which then waits for the next signal.
    while threadcount != 0:
        signal.pause()
        print("Signal unpaused, thread count %s" % threadcount)

    print("All threads finished")
    print("SIGCHLD handler called %s times" % sigchld_count)

If you want to see the SIGTERMs in action, extend the sleep time in thread_doer and issue a kill $pid command from another terminal, where $pid is the process ID printed at the start.

I post this as much in the hope of helping others as of being told that this is crazy or has a bug. I'm not sure whether the lock on threadcount is still necessary; I put it in early in my experimentation and thought I should leave it in just in case.

Leopard answered 21/2, 2011 at 7:53 Comment(0)
