I would like to add something to @Basile Starynkevitch's answer, which is overly pedantic.
While it's true that your signal handler isn't async-signal-safe, there's a good chance it will often work on Linux, so if you are seeing results being printed out, that isn't what's causing your issue of not seeing relevant stack information.
Some more likely problems include:
Incorrect compiler flags for your platform. Backtraces often work fine on x86 without special flags, but ARM can be more finicky. There are a few that I've tried that I can't remember, but the most important ones to try are -fno-omit-frame-pointer and -fasynchronous-unwind-tables. You can also try -fnon-call-exceptions, which is a weaker version of -fasynchronous-unwind-tables that should be sufficient for tracing crashes, at least on 64-bit ARM. On 32-bit ARM you will likely have to use -fnon-call-exceptions, since -fasynchronous-unwind-tables isn't currently implemented (as far as I know).
The code that's crashing was called through code that wasn't compiled with the correct flags for getting stack traces. For example, stack traces that originate in code called back from a .so that wasn't compiled with the correct compiler flags will often result in duplicate or truncated backtraces.
The signal that you are getting the backtrace for is not a thread-directed signal, but a process-directed one. Practically speaking, a thread-directed signal is one like SIGSEGV when the thread crashes, or one that another thread sends to a specific thread with something like pthread_kill. See man 7 signal for more information.
With that out of the way, I would like to address what you can be doing in your signal handler to get backtraces. It is true that you shouldn't be calling any stdio functions, malloc(), free(), etc., but it is not true that you can't call backtrace with a sane version of glibc/libgcc. From here, you can see that backtrace_symbols_fd is currently async-signal-safe. You can also see that backtrace is not. It looks very unsafe. However, man 3 backtrace tells us why these restrictions apply:
backtrace_symbols_fd() does not call malloc(3), and
so can be employed in situations where the latter function might
fail, but see NOTES.
Later:
backtrace() and backtrace_symbols_fd() don't call malloc()
explicitly, but they are part of libgcc, which gets loaded
dynamically when first used. Dynamic loading usually triggers a
call to malloc(3). If you need certain calls to these two
functions to not allocate memory (in signal handlers, for
example), you need to make sure libgcc is loaded beforehand.
A quick look at the source for backtrace confirms that the unsafe parts involve dynamically loading libgcc. You could get around this by statically linking both glibc and libgcc, but the most robust way of doing it is by making sure that libgcc is loaded before any signals are generated.
The way I do this is by calling backtrace once during program startup. Note that you must ask for at least one symbol; otherwise, the function early-outs without loading libgcc. Something like this should work:
#include <execinfo.h>  // backtrace, backtrace_symbols_fd
#include <stddef.h>    // NULL
#include <unistd.h>    // write, STDERR_FILENO

// Placeholder for whatever code triggers the crash; defined elsewhere.
void function_that_crashes(void);

// On Linux, especially on ARM, you want to use the sigaction version of this call.
// See my comments below.
static void
handle_signal(int sig)
{
    // Check signal type or whatever you want to do.
    // ...
    (void)sig;

    void* symbols[100];
    int n = backtrace(symbols, 100);

    // You could also either call a string formatting routine that you know
    // is async-signal-safe or save your backtrace and let another thread know
    // that this thread has crashed and the backtrace needs to be printed.
    //
    write(STDERR_FILENO, "Crash:\n", 7);
    backtrace_symbols_fd(symbols, n, STDERR_FILENO);

    // In the case of notifying another thread, which is what I do, you would
    // do something like this:
    //
    // threadLocalSymbolCount = backtrace(threadLocalSymbols, 100);
    // sem_post() or write() to an eventfd or whatever.
}

int main(int argc, char** argv)
{
    // Ask for one frame so that libgcc gets loaded before any signal arrives.
    void* dummy = NULL;
    backtrace(&dummy, 1);

    // Setup custom signal handling
    // ...

    function_that_crashes();
    return 0;
}
EDIT: The OP mentions that they are using uclibc instead of glibc, but the same arguments apply, since it loads libgcc dynamically to get backtraces as well. An interesting point is that the source for uclibc's backtrace mentions that -fasynchronous-unwind-tables is necessary.