Best way to invoke gdb from inside program to print its stacktrace?

Asked 30/6, 2010 at 17:27 Answered 30/6, 2010 at 18:30

Using a function like this:

#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

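void print_trace(void) {
    char pid_buf[30];
    snprintf(pid_buf, sizeof pid_buf, "--pid=%d", (int)getpid());
    char name_buf[512];
    /* readlink returns -1 on error and never null-terminates */
    ssize_t len = readlink("/proc/self/exe", name_buf, sizeof name_buf - 1);
    if (len < 0)
        return;
    name_buf[len] = 0;
    pid_t child_pid = fork();
    if (child_pid == 0) {
        dup2(2,1); // redirect output to stderr
        fprintf(stdout,"stack trace for %s %s\n",name_buf,pid_buf);
        fflush(stdout); /* exec would discard anything still buffered */
        execlp("gdb", "gdb", "--batch", "-n", "-ex", "thread", "-ex", "bt", name_buf, pid_buf, (char *)NULL);
        abort(); /* If gdb failed to start */
    } else if (child_pid > 0) {
        waitpid(child_pid,NULL,0);
    }
}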

I see the details of print_trace in the output.

What are other ways to do it?

Chromogenic answered 30/6, 2010 at 17:28 Comment(17)

Is there a problem with it? Something it doesn't do? – Ediva 30/6, 2010 at 17:37

@Adam Shiemke Some problems listed. Maybe gdb can be invoked in a more proper way. Maybe I need something special to support multithreading. Maybe there's a way to make it portable, or there is a special "libstacktrace.so". Discussion. – Chromogenic 30/6, 2010 at 18:4

You can use the -ex option more than once. – Dagney 30/6, 2010 at 18:52

@Derek Ledbetter, OK, applying. – Chromogenic 30/6, 2010 at 19:3

Do not use the above function! It seems to crash the X Server on Ubuntu 12.04. – Damocles 13/8, 2012 at 9:13

gdb crashes the X server? Are you trying to debug the X server? What messages do you see in Xorg.0.log? You should probably file a bug report if the X server crashes when you just debug some other program in a console. – Chromogenic 13/8, 2012 at 12:19

@Vi: gdb called by this function crashes it (or better said freezes it). And only on Ubuntu 12.04, which I don't have. But I got reports by 2 or 3 users who compiled my program from svn that the X Server "crashes", unless they remove that function – Damocles 14/8, 2012 at 22:27

You mean X server itself that is compiled with this function inserted into the X server's source code? Or other application that is using X server? – Chromogenic 15/8, 2012 at 20:22

@Chromogenic Just a normal (qt gui) application. – Damocles 17/8, 2012 at 8:50

If the "freeze" is a sort of captured input (e.g. like when some menu is opened: you can't click away, but the mouse is still moving and the clock in the tray is still updating its time), this is probably some bug in the X server; you should file a bug about it. The X server is running as root, Qt GUI applications are usually running as a user. A user's application should not be able to crash root's X server (with or without gdb or whatever). Also I used this function myself to debug Qt applications and it worked more or less well. – Chromogenic 17/8, 2012 at 13:21

Meant "if the freeze is not sort of..., this is probably some bug". – Chromogenic 17/8, 2012 at 17:35

My gdb requires "--pid" before specifying the pid. So it would be changed to: execlp("gdb", "gdb", "--batch", "-n", "-ex", "thread", "-ex", "bt", name_buf, "--pid=", pid_buf, NULL); – Veda 6/9, 2012 at 14:52

And it is getting worse: ptracing the parent is now no longer permitted. But perhaps there is a flag you can set with prctl? – Damocles 14/6, 2013 at 12:39

Only the direct parent, or any process on the path towards init? If ptrace scope is restricted to children, how do you use reptyr then? Or strace -p, gdb with a pid, and scanmem? The limit is too intrusive to be enabled by default on non-security-hardened distros. /* I also sometimes just strace the whole system */ – Chromogenic 14/6, 2013 at 18:51

C-only version: #106159 – Assegai 2/7, 2015 at 18:40

As a side note, there's a wrapper library for attaching GDB: libdebugme. – Substitutive 25/8, 2018 at 12:33

FYI the code in this question no longer works, but I've edited the answer here with working code, based on the answer's comments. – Nord 22/11, 2020 at 20:38
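On the prctl question raised in the comments above: on kernels that use the Yama security module with a restricted ptrace scope, a process can opt in to being traced by calling prctl(PR_SET_PTRACER, ...). The sketch below is only an illustration under that assumption; the helper name allow_debugger_attach is made up here, and PR_SET_PTRACER_ANY is the broadest setting (passing the pid of the forked gdb child instead would be narrower). It would be called before forking gdb.

```c
#include <stdio.h>
#include <sys/prctl.h>

/* Hypothetical helper: asks the kernel to let any process ptrace-attach
 * to the caller. Returns 0 on success, -1 if the request is unsupported
 * (for example, Yama is not active) or refused. */
int allow_debugger_attach(void) {
#ifdef PR_SET_PTRACER
    return prctl(PR_SET_PTRACER, PR_SET_PTRACER_ANY, 0, 0, 0);
#else
    return -1;   /* kernel headers predate PR_SET_PTRACER */
#endif
}
```

A failure here is not fatal: the program can still try to launch gdb, and the attach will either work or be refused by the kernel.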


You mentioned on my other answer (now deleted) that you also want to see line numbers. I'm not sure how to do that when invoking gdb from inside your application.

But I'm going to share with you a couple of ways to print a simple stacktrace with function names and their respective line numbers without using gdb. Most of them came from a very nice article from Linux Journal:

Method #1:

The first method is to disseminate it with print and log messages in order to pinpoint the execution path. In a complex program, this option can become cumbersome and tedious even if, with the help of some GCC-specific macros, it can be simplified a bit. Consider, for example, a debug macro such as:

 #define TRACE_MSG fprintf(stderr,                     \
                           "%s() [%s:%d] here I am\n", \
                           __FUNCTION__, __FILE__, __LINE__)

You can propagate this macro quickly throughout your program by cutting and pasting it. When you do not need it anymore, switch it off simply by defining it to no-op.
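A minimal sketch of the "define it to no-op" idea, assuming a compile-time flag named ENABLE_TRACE (a hypothetical name, not from the article) and an illustrative function traced_add:

```c
#include <stdio.h>

/* ENABLE_TRACE is a hypothetical flag: build with -DENABLE_TRACE to turn tracing on */
#ifdef ENABLE_TRACE
#define TRACE_MSG fprintf(stderr, "%s() [%s:%d] here I am\n", \
                          __func__, __FILE__, __LINE__)
#else
#define TRACE_MSG ((void)0)   /* expands to a statement that does nothing */
#endif

/* Illustrative function: traces its entry, then does its work */
int traced_add(int a, int b) {
    TRACE_MSG;
    return a + b;
}
```

Because the disabled form is still a valid expression, every `TRACE_MSG;` line keeps compiling whether or not tracing is enabled.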

Method #2: (It doesn't say anything about line numbers, but I cover them in method #4)

A nicer way to get a stack backtrace, however, is to use some of the specific support functions provided by glibc. The key one is backtrace(), which navigates the stack frames from the calling point to the beginning of the program and provides an array of return addresses. You then can map each address to the body of a particular function in your code by having a look at the object file with the nm command. Or, you can do it a simpler way--use backtrace_symbols(). This function transforms a list of return addresses, as returned by backtrace(), into a list of strings, each containing the function name, the offset within the function, and the return address. The list of strings is allocated from your heap space (as if you called malloc()), so you should free() it as soon as you are done with it.

I encourage you to read it since the page has source code examples. In order to convert an address to a function name you must compile your application with the -rdynamic option.
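For reference, here is a small sketch of the backtrace() / backtrace_symbols() pairing described above. It is not the article's code; the function name print_backtrace and the MAX_FRAMES limit are made up for illustration, and it assumes glibc's <execinfo.h>.

```c
#include <execinfo.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_FRAMES 32   /* arbitrary cap on how many frames to capture */

/* Prints the current call stack to stderr.
 * Returns the number of frames captured, or -1 if symbol lookup failed. */
int print_backtrace(void) {
    void *frames[MAX_FRAMES];
    int count = backtrace(frames, MAX_FRAMES);
    char **symbols = backtrace_symbols(frames, count);
    if (symbols == NULL) {
        perror("backtrace_symbols");
        return -1;
    }
    for (int i = 0; i < count; ++i)
        fprintf(stderr, "[bt] #%d %s\n", i, symbols[i]);
    free(symbols);   /* one free() for the whole array, not one per string */
    return count;
}
```

Without -rdynamic the strings usually show only the binary name and raw addresses for functions in the main executable.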

Method #3: (A better way of doing method 2)

An even more useful application for this technique is putting a stack backtrace inside a signal handler and having the latter catch all the "bad" signals your program can receive (SIGSEGV, SIGBUS, SIGILL, SIGFPE and the like). This way, if your program unfortunately crashes and you were not running it with a debugger, you can get a stack trace and know where the fault happened. This technique also can be used to understand where your program is looping in case it stops responding.

An implementation of this technique is available here.
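As a rough sketch of the idea (not the linked implementation), the handler below uses backtrace_symbols_fd(), which writes straight to a file descriptor and does not call malloc(), so it is safer inside a signal handler than printf(). The names backtrace_handler, install_backtrace_handler and last_signal are invented for this example, and treating SIGUSR1 as the only non-fatal signal is an assumption of the sketch.

```c
#define _GNU_SOURCE
#include <execinfo.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Records the most recent signal handled, for inspection after the fact */
static volatile sig_atomic_t last_signal = 0;

static void backtrace_handler(int sig) {
    void *frames[32];
    int count = backtrace(frames, 32);
    /* writes directly to the descriptor; does not call malloc() */
    backtrace_symbols_fd(frames, count, STDERR_FILENO);
    last_signal = sig;
    if (sig != SIGUSR1) {
        /* fatal signal: restore the default action and re-raise,
         * so the process still terminates (and can dump core) */
        signal(sig, SIG_DFL);
        raise(sig);
    }
}

/* Installs the handler for one signal; returns 0 on success, -1 on failure. */
int install_backtrace_handler(int sig) {
    void *warmup[1];
    backtrace(warmup, 1);   /* first call may load libgcc; do that outside the handler */

    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = backtrace_handler;
    sigemptyset(&sa.sa_mask);
    return sigaction(sig, &sa, NULL);
}
```

Calling install_backtrace_handler(SIGSEGV) early in main() covers crashes, while install_backtrace_handler(SIGUSR1) lets you request a trace from a running process with kill -USR1.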

Method #4:

A small improvement I've made to method #3 to print line numbers. This could be copied to work on method #2 also.

Basically, I followed a tip that uses addr2line to convert addresses into file names and line numbers.

The source code below prints line numbers for all local functions. If a function from another library is called, you might see a couple of ??:0 instead of file names.

#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <execinfo.h>

void bt_sighandler(int sig, struct sigcontext ctx) {

  void *trace[16];
  char **messages = (char **)NULL;
  int i, trace_size = 0;

  if (sig == SIGSEGV)
    printf("Got signal %d, faulty address is %p, "
           "from %p\n", sig, ctx.cr2, ctx.eip);
  else
    printf("Got signal %d\n", sig);

  trace_size = backtrace(trace, 16);
  /* overwrite sigaction with caller's address */
  trace[1] = (void *)ctx.eip;
  messages = backtrace_symbols(trace, trace_size);
  /* skip first stack frame (points here) */
  printf("[bt] Execution path:\n");
  for (i=1; i<trace_size; ++i)
  {
    printf("[bt] #%d %s\n", i, messages[i]);

    /* find first occurrence of '(' or ' ' in messages[i] and assume
     * everything before that is the file name (don't go beyond the
     * string terminator) */
    size_t p = 0;
    while(messages[i][p] != '(' && messages[i][p] != ' '
            && messages[i][p] != 0)
        ++p;

    char syscom[256];
    /* "%.*s" takes an int precision, so cast p; snprintf guards the buffer */
    snprintf(syscom, sizeof syscom, "addr2line %p -e %.*s", trace[i], (int)p, messages[i]);
        //last parameter is the file name of the symbol
    system(syscom);
  }

  exit(0);
}


int func_a(int a, char b) {

  char *p = (char *)0xdeadbeef;

  a = a + b;
  *p = 10;  /* CRASH here!! */

  return 2*a;
}


int func_b() {

  int res, a = 5;

  res = 5 + func_a(a, 't');

  return res;
}


int main() {

  /* Install our signal handler */
  struct sigaction sa;

  sa.sa_handler = (void *)bt_sighandler;
  sigemptyset(&sa.sa_mask);
  sa.sa_flags = SA_RESTART;

  sigaction(SIGSEGV, &sa, NULL);
  sigaction(SIGUSR1, &sa, NULL);
  /* ... add any other signal here */

  /* Do something */
  printf("%d\n", func_b());
}

This code should be compiled with debug information (-g), so that addr2line can resolve line numbers: gcc -g sighandler.c -o sighandler -rdynamic

The program outputs:

Got signal 11, faulty address is 0xdeadbeef, from 0x8048975
[bt] Execution path:
[bt] #1 ./sighandler(func_a+0x1d) [0x8048975]
/home/karl/workspace/stacktrace/sighandler.c:44
[bt] #2 ./sighandler(func_b+0x20) [0x804899f]
/home/karl/workspace/stacktrace/sighandler.c:54
[bt] #3 ./sighandler(main+0x6c) [0x8048a16]
/home/karl/workspace/stacktrace/sighandler.c:74
[bt] #4 /lib/tls/i686/cmov/libc.so.6(__libc_start_main+0xe6) [0x3fdbd6]
??:0
[bt] #5 ./sighandler() [0x8048781]
??:0

Update 2012/04/28: for recent Linux kernel versions, the above sigaction signature is obsolete. Also, I improved it a bit by grabbing the executable name from this answer. Here is an up-to-date version:

/* _GNU_SOURCE must come before every #include to expose REG_EIP/REG_RIP */
#define _GNU_SOURCE
#include <execinfo.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <ucontext.h>
#include <unistd.h>

static char exe[1024];

int initialiseExecutableName(void)
{
    char link[64];
    snprintf(link, sizeof link, "/proc/%d/exe", (int)getpid());
    /* readlink does not null-terminate: reserve one byte and terminate by hand */
    ssize_t len = readlink(link, exe, sizeof exe - 1);
    if (len == -1) {
        perror("readlink");
        exit(1);
    }
    exe[len] = '\0';
    printf("Executable name initialised: %s\n", exe);
    return 0;
}

const char* getExecutableName(void)
{
    if (exe[0] == '\0')
        initialiseExecutableName();
    return exe;
}

/* program counter register: REG_RIP on x86-64, REG_EIP on 32-bit x86 */
#if defined(__x86_64__)
#define BT_REG_PC REG_RIP
#else
#define BT_REG_PC REG_EIP
#endif

void bt_sighandler(int sig, siginfo_t *info,
                   void *secret) {

  void *trace[16];
  char **messages = (char **)NULL;
  int i, trace_size = 0;
  ucontext_t *uc = (ucontext_t *)secret;
  void *pc = (void *) uc->uc_mcontext.gregs[BT_REG_PC];

  /* Do something useful with siginfo_t */
  if (sig == SIGSEGV)
    printf("Got signal %d, faulty address is %p, "
           "from %p\n", sig, info->si_addr, pc);
  else
    printf("Got signal %d\n", sig);

  trace_size = backtrace(trace, 16);
  /* overwrite sigaction with caller's address */
  trace[1] = pc;

  messages = backtrace_symbols(trace, trace_size);
  if (messages == NULL)
    exit(1);
  /* skip first stack frame (points here) */
  printf("[bt] Execution path:\n");
  for (i=1; i<trace_size; ++i)
  {
    printf("[bt] %s\n", messages[i]);

    /* find first occurrence of '(' or ' ' in messages[i] and assume
     * everything before that is the file name (don't go beyond the
     * string terminator) */
    size_t p = 0;
    while(messages[i][p] != '(' && messages[i][p] != ' '
            && messages[i][p] != 0)
        ++p;

    char syscom[256];
    /* "%.*s" takes an int precision, so cast p; snprintf guards the buffer */
    snprintf(syscom, sizeof syscom, "addr2line %p -e %.*s", trace[i], (int)p, messages[i]);
           //last parameter is the filename of the symbol
    system(syscom);

  }
  exit(0);
}

and initialise like this:

int main() {

  /* Install our signal handler */
  struct sigaction sa;

  sa.sa_sigaction = bt_sighandler;
  sigemptyset (&sa.sa_mask);
  sa.sa_flags = SA_RESTART | SA_SIGINFO;

  sigaction(SIGSEGV, &sa, NULL);
  sigaction(SIGUSR1, &sa, NULL);
  /* ... add any other signal here */

  /* Do something */
  printf("%d\n", func_b());

}

Sweetener answered 30/6, 2010 at 17:28 Comment(13)

"Method #1" -> There is my other question on SO about how to "propogate" it automatically, but with no useful answers. – Chromogenic 6/1, 2011 at 15:2

Methods #2 - #4 -> Already tried - it works: vi-server.org/vi/simple_sampling_profiler.html But the backtrace/addr2line approach has limitations: 1. often addr2line cannot figure out the line (while gdb can), 2. gdb can iterate threads: "thread apply all bt". – Chromogenic 6/1, 2011 at 15:3

@Vi This guy nailed it: https://mcmap.net/q/65974/-how-to-get-a-stack-trace-for-c-using-gcc-with-line-number-information-duplicate/… – Sweetener 19/1, 2011 at 18:1

@karlphillip: I found another way to put file and line numbers into the stacktrace. Use libbfd (sourceware.org/binutils/docs-2.21/bfd/…) as they did in refdbg: refdbg.cvs.sourceforge.net/viewvc/refdbg/refdbg/… I have not tried it myself yet. – Quitrent 2/11, 2011 at 14:19

@karlphillip, it seems that this sa_handler signature is obsolete now (see tutorialspoint.com/unix_system_calls/sigaction.htm) i'm trying to determine what needs to change in recent linux versions – Synergistic 28/4, 2012 at 14:37

@Sweetener i found an updated version in here: linuxjournal.com/files/linuxjournal.com/linuxjournal/articles/… – Synergistic 28/4, 2012 at 14:47

You seem to have missed a ',' after FUNCTION in method#1. – Lorusso 25/4, 2013 at 11:12

@Sweetener I tried your solution and looked into the linux journal article, however I need to do what is being done in that program without a crash in the program. Basically I am trying to implement a custom exception class which will print the backtrace and line numbers, etc when an exception is caught. So, I do not have a SIGSEGV coming into the picture. Any thoughts on how that might be achieved ? – Arlyne 5/6, 2013 at 14:58

This doesn't compile on OS X. Is it supposed to? – Samurai 22/1, 2014 at 14:48

You have an error in your initialiseExecutableName code: readlink does not append a string-terminating null to exe, so you have to use the return result to figure out the size and insert the null yourself (you'll want to make sure you have enough room for the null, too). Also, you should be using "sizeof exe" instead of "sizeof link" when calling it. – Inextensible 19/5, 2014 at 5:59

Note that for x64 code you need to use REG_RIP instead of REG_EIP – Counterintelligence 9/9, 2014 at 18:21

backtrace_symbols() gives me the names of the files where the symbols reside. I edited your answer to use those file names instead of a fixed one. That way, it is possible to find the lines of shared libraries as well (given that they were compiled for debugging). – Ge 10/4, 2015 at 15:59

In addition to using -rdynamic, also check that your build system doesn't add -fvisibility=hidden option! (as it will completely discard the effect of -rdynamic) – Dinodinoflagellate 13/5, 2020 at 0:43

If you're using Linux, the standard C library includes a function called backtrace, which populates an array with frames' return addresses, and another function called backtrace_symbols, which will take the addresses from backtrace and look up the corresponding function names. These are documented in the GNU C Library manual.

Those won't show argument values, source lines, and the like, and they only apply to the calling thread. However, they should be a lot faster (and perhaps less flaky) than running GDB that way, so they have their place.

Squint answered 30/6, 2010 at 17:28 Comment(1)

Actually, the snippet that I insert into the program first outputs a backtrace with backtrace_symbols and then starts gdb to output fully annotated stack traces for all threads. If gdb fails, I still have the backtrace's stacktrace. – Chromogenic 4/8, 2010 at 16:51

nobar posted a fantastic answer. In short:

So you want a stand-alone function that prints a stack trace with all of the features that gdb stack traces have and that doesn't terminate your application. The answer is to automate the launch of gdb in a non-interactive mode to perform just the tasks that you want.

This is done by executing gdb in a child process, using fork(), and scripting it to display a stack-trace while your application waits for it to complete. This can be performed without the use of a core-dump and without aborting the application.
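A minimal sketch of that approach follows. It is not nobar's code; the function name print_trace_all_threads is invented here, and it uses the "thread apply all bt full" command mentioned elsewhere in this thread so that every thread is covered. It assumes gdb is on the PATH and that the kernel allows a child to attach to its parent, which some systems refuse.

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs gdb in batch mode against the calling process and waits for it.
 * Returns gdb's exit status, 127 if gdb could not be started,
 * or -1 on fork/wait errors or abnormal termination. */
int print_trace_all_threads(void) {
    char pid_arg[32];
    snprintf(pid_arg, sizeof pid_arg, "--pid=%d", (int)getpid());

    pid_t child = fork();
    if (child < 0)
        return -1;
    if (child == 0) {
        dup2(STDERR_FILENO, STDOUT_FILENO);   /* send gdb's output to stderr */
        execlp("gdb", "gdb", "--batch", "-n",
               "-ex", "thread apply all bt full",
               pid_arg, (char *)NULL);
        _exit(127);                           /* exec failed, e.g. gdb is not installed */
    }
    int status = 0;
    if (waitpid(child, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The application keeps running after the call returns, so the function can be invoked from an error path without aborting.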

I believe that this is what you are looking for, @Vi

Sweetener answered 30/6, 2010 at 17:28 Comment(2)

Look at the sample code in the question. It is that method. I'm looking for other, less heavyweight ways. The main problem with addr2line-quality things is that they often cannot display the line number where gdb can. – Chromogenic 20/1, 2011 at 0:5

@Vi It is stated in his answer that he got the base code from your question in this thread. However, if you look more closely you will see that there are some differences. Have you tried it? – Sweetener 20/1, 2011 at 0:43

Isn't abort() simpler?

That way if it happens in the field the customer can send you the core file (I don't know many users who are involved enough in my application to want me to force them to debug it).

Cyclograph answered 30/6, 2010 at 18:30 Comment(3)

I don't need to abort. I need a stack trace. Program can continue after printing it. And I like the verbosity of "bt full" – Chromogenic 30/6, 2010 at 19:7

Also, the print_trace() way is rather unintrusive. If gdb is not found, the program can just continue without printing a stacktrace. – Chromogenic 30/6, 2010 at 19:11

@Vi, OK sorry I wasn't any help :o/ – Cyclograph 1/7, 2010 at 6:4
