Why does sem_wait not unblock (and return -1) on an interrupt?

2

9

I have a programme using sem_wait. The POSIX specification says:

The sem_wait() function is interruptible by the delivery of a signal.

Additionally, in the section about errors it says:

[EINTR] - A signal interrupted this function.

However, in my programme, sending a signal does not unblock the call (and return -1 as indicated in the spec).

A minimal example can be found below. This programme hangs and sem_wait never unblocks after the signal is sent.

#include <semaphore.h>
#include <pthread.h>
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

sem_t sem;

void sighandler(int sig) {
  printf("Inside sighandler\n");
}

void *thread_listen(void *arg) {
  signal(SIGUSR1, &sighandler);
  printf("sem_wait = %d\n", sem_wait(&sem));
  return NULL;
}

int main(void) {

  pthread_t thread;

  sem_init(&sem, 0, 0); 

  pthread_create(&thread, NULL, &thread_listen, NULL);

  sleep(1);
  raise(SIGUSR1);

  pthread_join(thread, NULL);

  return 0;
}

The programme outputs Inside sighandler then hangs.

There is another question here about this, but it doesn't really provide any clarity.

Am I misunderstanding what the spec says? FYI my computer uses Ubuntu GLIBC 2.31-0ubuntu9.

Shere answered 26/4, 2020 at 22:23 Comment(1)
If your goal is just to wake up a thread, a pthread_cond_t (condition variable) is a much more appropriate contract. Looks like you already got the right answer below anyway. - Lex
11

There are three reasons why this program doesn't behave as you expect, only two of which are fixable.

  1. As pointed out in David Schwartz’s answer, in a multi-threaded program, raise sends a signal to the thread that calls raise.

    To get the signal sent to the thread you wanted, in this test program, change the raise(SIGUSR1) to pthread_kill(thread, SIGUSR1). However, if you want that specific thread to handle SIGUSR1 when it’s sent to the entire process, what you need to do is use pthread_sigmask to block SIGUSR1 in all of the threads except the one that's supposed to handle it. (See below for more detail on this.)

  2. On systems that use glibc, signal installs a signal handler that does not interrupt blocking system calls. To get a signal handler that does, you need to use sigaction and set sa_flags to a value that doesn’t include SA_RESTART. For instance,

      struct sigaction sa;
      sigemptyset(&sa.sa_mask);
      sa.sa_handler = sighandler;
      sa.sa_flags = 0;
      sigaction(SIGUSR1, &sa, 0);
    

    Note: memset(&sa, 0, sizeof sa) is not guaranteed to have the same effect as sigemptyset(&sa.sa_mask).

    Note: Signal handlers are process-global, so it doesn’t matter which thread you call sigaction on. In almost all cases, multithreaded programs should do all their sigaction calls in main before creating any threads, just to make sure the signal handlers are active before any signals can happen.

  3. The signal could be delivered to the thread before the thread has a chance to call sem_wait. If that happens, the signal handler will be called and return, and then sem_wait will be called and it will block forever. In this test program, you can make this arbitrarily unlikely by increasing the length of the sleep in main, but there is no way to make it impossible. This is the unfixable reason.

    There are a small number of system calls that atomically unblock signals while sleeping, and then block them again before returning to user space, such as sigsuspend, sigwaitinfo, and pselect. These are the only system calls for which this race condition can be avoided.

    Best practice for a multi-threaded program that has to deal with signals is to have one thread devoted to signal handling. To make that work reliably, you should block all signals except for synchronous CPU exceptions (SIGABRT, SIGBUS, SIGFPE, SIGILL, SIGSEGV, SIGSYS, and SIGTRAP) at the very beginning of main, before creating any threads. Then you set a do-nothing signal handler (with SA_RESTART) for the signals you want to handle; these will never actually be called, their purpose is to prevent the kernel from killing the process due to the default action of SIGUSR1 or whatever. The set of signals you care about must include all of the signals for user interrupts: SIGHUP, SIGINT, SIGPWR, SIGQUIT, SIGTERM, SIGTSTP, SIGXCPU, SIGXFSZ. Finally, you create the signal-handling thread, which loops calling sigwaitinfo for the appropriate set of signals, and dispatches messages to the rest of the threads using pipes or condition variables or anything but signals really. This thread must never block in any system call other than sigwaitinfo.

    In the case of this test program, the signal-handling thread would respond to SIGUSR1 by calling sem_post(&sem). This would either wake up the listener thread, or it would cause the listener thread not to become blocked on sem_wait in the first place.

Raddatz answered 26/4, 2020 at 22:40 Comment(0)
4

In a multi-threaded program, raise sends a signal to the thread that calls raise. You need to use kill(getpid(), ...) or pthread_kill(thread, ...).

Frith answered 26/4, 2020 at 22:32 Comment(4)
The man page for raise says it is equivalent to kill(getpid(), sig). Additionally, I know the thread is getting the signal because Inside sighandler is printed. - Shere
That part of the manpage is only accurate for single-threaded programs. Yes, it should be corrected. - Raddatz
@Shere If your man page says that, it is wrong. Can you find it online somewhere and give me a link so I can figure out who to contact to get it fixed? (I'm not sure why you think seeing "Inside sighandler" proves which thread got the signal. On your platform, signal probably sets the process signal handler for all threads that don't explicitly override it. But that's just a guess since signal's behavior isn't defined in this context and can vary.) - Frith
My bad, the man page specifies that this only applies to a single-threaded programme. - Shere
