How to get proper backtrace in process signal handler (armv7-uclibc)?
U

3

8

I have searched many times for the right way to use backtrace() in a signal handler and have tried almost everything, but I have not been able to get a usable backtrace from my signal handler. Note that this is not a SIGUSR1 handler.

  • enabled UCLIBC_HAS_BACKTRACE=y in the uclibc config and rebuilt it
  • verified that libubacktrace.so is created
  • compiled my application binaries with the following options: -g -rdynamic -fexceptions or -funwind-tables
  • The binary itself seems to be "stripped"

However, I was not able to get the full backtrace from the signal handler. Only the addresses of the functions called inside the signal handler were printed.

If I use the target gdb binary and attach to the process with gdb --pid, I get the full backtrace properly.

I also tried pstack (pstack-1.2, with the ARM patch), but it was not helpful: nothing was printed.

Any advice?


1) Compiler options in Makefile

CFLAGS += -g -fexceptions -funwind-tables -Werror $(WARN) ...

2) Code

The code is extremely simple.

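#include <execinfo.h>   /* backtrace, backtrace_symbols */
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>     /* free */

#define CALLSTACK_SIZE 10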

static void print_stack(void) {
    int i, nptrs;
    void *buf[CALLSTACK_SIZE + 1];
    char **strings;

    nptrs = backtrace(buf, CALLSTACK_SIZE);
    printf("%s: backtrace() returned %d addresses\n", __func__, nptrs);

    strings = backtrace_symbols(buf, nptrs);

    if(strings == NULL) {
        printf("%s: no backtrace captured\n", __func__);
        return;
    }

    for(i = 0; i < nptrs; i++) {
        printf("%s\n", strings[i]);
    }

    free(strings);
}

...
static void sigHandler(int signum)
{
    printf("%s: signal %d\n", __FUNCTION__, signum);
    switch(signum ) {
    case SIGUSR2:
        // told to quit
        print_stack();
        break;
    default:
        break;
    }
}
Unboned answered 1/5, 2015 at 6:36 Comment(6)
You need to show some code, in particular your signal handler code. Please edit your question to improve it. – Condole
What kind of application are you coding? Do you have some event loop? – Condole
Kind of... This process waits for a "message" from another process. SysV IPC messages are set up among multiple processes, and they post messages to each other. Other than that, there is not much special going on. No socket I/O. It updates a character device by calling open() and write(), but only writes 1 byte. – Unboned
Why don't you use AF_UNIX sockets? Or pipes, or FIFOs? – Condole
Could you let me know how that would help? Anyway, this is not my code and I need to debug an issue in this application process. That's why I'm trying to set up the signal handler and see what's going on. Anyway, many thanks! – Unboned
There is no fail-proof way, since you are violating the async-signal-safe requirement. You might try Ian Taylor's libbacktrace, but you really should redesign and refactor your application to obey the signal(7) requirements. For debugging purposes, using gdb is much simpler! – Condole
C
10

Read carefully signal(7) and signal-safety(7).

A signal handler is restricted to calling (directly or indirectly) only async-signal-safe functions (practically speaking, mostly syscalls(2)), and backtrace(3), printf(3), malloc(3), and free(3) are not async-signal-safe. So your code is incorrect: the signal handler sigHandler calls printf and, indirectly through print_stack, free, and neither is async-signal-safe.

So the only fully reliable option is to use the gdb debugger.

Read more about POSIX signal.h and signal concepts. Practically speaking, nearly the only sensible thing a signal handler can do is set some global, thread-local, or static volatile sig_atomic_t flag, which has to be tested elsewhere. It could also directly write(2) a few bytes into a pipe(7) that your application reads elsewhere (e.g. in its event loop, if it is a GUI application).

You could also use Ian Taylor's libbacktrace from inside GCC (assuming your program is compiled with debug info, e.g. with -g). It is not guaranteed to work in signal handlers (since it is not using only async-signal-safe functions), but it is practically quite useful.

Notice that the kernel is setting a call frame (in the call stack) for sigreturn(2) when processing a signal.

You might also use (especially if your application is single-threaded) sigaltstack(2) to have an alternate signal stack. I'm not sure it would be helpful.

If you have an event loop, you might consider using the Linux-specific signalfd(2) and asking your event loop to poll it. For SIGTERM, SIGQUIT, or SIGALRM it is quite a useful trick.

Condole answered 1/5, 2015 at 6:40 Comment(1)
Using signalfd() would completely defeat the purpose of trying to get a backtrace in a signal handler, because at that point your backtrace would always point to the function you call from the poll() loop. – Strauss
D
12

I would like to add something to @Basile Starynkevitch's answer, which is overly pedantic. While it's true that your signal handler isn't async-signal-safe, there's a good chance it will often work on Linux, so if you are seeing results being printed out, that isn't what's causing your issue of not seeing relevant stack information.

Some more likely problems include:

  1. Incorrect compiler flags for your platform. Backtraces often work fine on x86 without special flags, but ARM can be more finicky. There are a few that I've tried and can't remember, but the most important ones to try are -fno-omit-frame-pointer and -fasynchronous-unwind-tables. You can also try -fnon-call-exceptions, which is a weaker version of -fasynchronous-unwind-tables that should be sufficient for tracing crashes, at least on 64-bit ARM. On 32-bit ARM you will likely have to use -fnon-call-exceptions, since -fasynchronous-unwind-tables isn't currently implemented (as far as I know).

  2. The code that's crashing was called through code that wasn't compiled with correct flags for getting stack traces. For example, stack traces that originate in code that calls back from a .so that wasn't compiled with correct compiler flags will often result in duplicate or truncated backtraces.

  3. The signal that you are getting the backtrace for is not a thread-directed signal, but a process-directed one. Practically speaking, a thread-directed signal is one like SIGSEGV when the thread crashes, or one that another thread sends to a specific thread with something like pthread_kill. See man 7 signal for more information.

With that out of the way, I would like to address what you can do in your signal handler to get backtraces. It is true that you shouldn't be calling any stdio functions, malloc(), free(), etc., but it is not true that you can't call backtrace with a sane version of glibc/libgcc. From here, you can see that backtrace_symbols_fd is currently async-signal-safe. You can also see that backtrace is not. It looks very unsafe. However, man 3 backtrace tells us why these restrictions apply:

backtrace_symbols_fd() does not call malloc(3), and so can be employed in situations where the latter function might fail, but see NOTES.

Later:

backtrace() and backtrace_symbols_fd() don't call malloc() explicitly, but they are part of libgcc, which gets loaded dynamically when first used. Dynamic loading usually triggers a call to malloc(3). If you need certain calls to these two functions to not allocate memory (in signal handlers, for example), you need to make sure libgcc is loaded beforehand.

A quick look at the source for backtrace confirms that the unsafe parts involve dynamically loading libgcc. You could get around this by statically linking both glibc and libgcc, but the most robust way of doing it is by making sure that libgcc is loaded before any signals are generated.

The way I do this is by calling backtrace once during program startup. Note that you must ask for at least one symbol; otherwise, the function early-outs without loading libgcc. Something like this should work:

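#include <execinfo.h>   // backtrace, backtrace_symbols_fd
#include <signal.h>
#include <unistd.h>     // write, STDERR_FILENO

// On linux, especially on ARM, you want to use the sigaction version of this call.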
// See my comments below.
static void
handle_signal(int sig)
{
    // Check signal type or whatever you want to do.
    // ...
    
    void* symbols[100];
    int n = backtrace(symbols, 100);
    
    // You could also either call a string formatting routine that you know
    // is async-signal-safe or save your backtrace and let another thread know
    // that this thread has crashed and the backtrace needs to be printed.
    //
    write(STDERR_FILENO, "Crash:\n", 7);
    backtrace_symbols_fd(symbols, n, STDERR_FILENO);

    // In the case of notifying another thread, which is what I do, you would
    // do something like this:
    //
    // threadLocalSymbolCount = backtrace(threadLocalSymbols, 100);
    // sem_post() or write() to an eventfd or whatever.
}

int main(int argc, char** argv)
{
    void* dummy = NULL;
    backtrace(&dummy, 1);
    
    // Setup custom signal handling
    // ...

    function_that_crashes();

    return 0;
}

EDIT: The OP mentions that they are using uclibc instead of glibc, but the same arguments apply, since it loads libgcc dynamically to get backtraces as well. An interesting point is that the source for uclibc's backtrace mentions that -fasynchronous-unwind-tables is necessary.

Dagostino answered 17/5, 2020 at 23:35 Comment(0)
Z
0

Maybe you can try boost::stacktrace::safe_dump_to

A possible implementation follows (fprintf and dladdr seem to work in practice, although neither is guaranteed to be async-signal-safe):

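#include <dlfcn.h>               // dladdr, Dl_info
#include <cstdint>
#include <cstdio>
#include <boost/stacktrace.hpp>  // boost::stacktrace::safe_dump_to

void signal_safe_dump_bt_buf(uint64_t *bt_buf, size_t bt_cnt) {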
  for (size_t i = 0; i < bt_cnt; i++) {
    uint64_t addr = bt_buf[i];
    Dl_info info;
    if (dladdr((void *)addr, &info) != 0) {
      if (info.dli_saddr) {
        fprintf(stderr, "0x%016lx: %s (offset 0x%lx) at %s\n",
                addr - (uint64_t)info.dli_fbase,
                info.dli_sname ? info.dli_sname : "?",
                (uint64_t)info.dli_saddr - (uint64_t)info.dli_fbase,
                info.dli_fname ? info.dli_fname : "?");
      } else {
        fprintf(stderr, "0x%016lx: %s at %s\n", addr - (uint64_t)info.dli_fbase,
                info.dli_sname ? info.dli_sname : "?",
                info.dli_fname ? info.dli_fname : "?");
      }
    }
  }
}

void signal_safe_dump_bt() {
  constexpr size_t max_bt_count = 100;  // compile-time size, so bt_buf is not a VLA
  uint64_t bt_buf[max_bt_count];
  size_t bt_cnt =
      boost::stacktrace::safe_dump_to(bt_buf, sizeof(bt_buf));

  signal_safe_dump_bt_buf(bt_buf, bt_cnt);
}
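Since fprintf is not on the async-signal-safe list, one alternative is to format the addresses by hand and emit them with write(2) only. The sketch below is plain C with hypothetical helper names (format_hex_line, dump_addresses). It prints raw addresses, one per line, which can be symbolized afterwards outside the handler, for example with addr2line.

```c
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Formats value as "0x", a fixed number of hex digits and '\n' into out.
 * out must hold at least 2 + 2 * sizeof(uintptr_t) + 1 bytes.
 * Returns the number of bytes written to out (no terminating NUL). */
static size_t format_hex_line(uintptr_t value, char *out)
{
    static const char digits[] = "0123456789abcdef";
    const size_t ndigits = 2 * sizeof(uintptr_t);
    size_t i;

    out[0] = '0';
    out[1] = 'x';
    /* Fill the digits from the least significant nibble backwards. */
    for (i = 0; i < ndigits; i++) {
        out[2 + ndigits - 1 - i] = digits[value & 0xf];
        value >>= 4;
    }
    out[2 + ndigits] = '\n';
    return 2 + ndigits + 1;
}

/* Writes each address on its own line to fd using only write(2).
 * Returns 0 on success, -1 if a write fails or is short. */
static int dump_addresses(int fd, void *const *addrs, size_t count)
{
    char line[2 + 2 * sizeof(uintptr_t) + 1];
    size_t i;

    for (i = 0; i < count; i++) {
        size_t len = format_hex_line((uintptr_t)addrs[i], line);

        if (write(fd, line, len) != (ssize_t)len)
            return -1;
    }
    return 0;
}
```

The buffer filled by backtrace() can be passed directly as addrs, together with the returned frame count.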

A related blog is here

Zee answered 3/2, 2023 at 22:17 Comment(0)
