How to get a stack trace for C++ using gcc with line number information? [duplicate]
M

15

73

We use stack traces in a proprietary assert-like macro to catch developer mistakes: when an error is caught, a stack trace is printed.

I find the backtrace()/backtrace_symbols() pair available with gcc (they are provided by glibc) insufficient:

  1. Names are mangled
  2. No line information

The first problem can be solved with abi::__cxa_demangle.
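For what it's worth, here is a minimal sketch of that demangling step. The helper name demangle() is my own, not part of any library, and it assumes the Itanium C++ ABI header <cxxabi.h> that ships with gcc and clang. It falls back to the raw input when the string is not a mangled name.

```cpp
#include <cxxabi.h>
#include <cstdlib>
#include <memory>
#include <string>

// Hypothetical helper: returns the demangled form of `mangled`, or the
// input unchanged when it is not a valid mangled name.
std::string demangle(const char* mangled) {
    int status = 0;
    // __cxa_demangle returns a malloc()ed buffer that the caller must free().
    std::unique_ptr<char, void (*)(void*)> out(
        abi::__cxa_demangle(mangled, nullptr, nullptr, &status), std::free);
    return (status == 0 && out) ? std::string(out.get())
                                : std::string(mangled);
}
```

With backtrace_symbols() output you would first cut out the part between '(' and '+' and pass only that to the helper, e.g. demangle("_Z3foov") gives "foo()".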

However, the second problem is tougher. I found a replacement for backtrace_symbols(). It is better than the stock backtrace_symbols(), since it can retrieve line numbers (if compiled with -g) and you don't need to compile with -rdynamic.

However, the code is GNU licensed, so IMHO I can't use it in commercial code.

Any proposal?

P.S.

gdb is capable of printing the arguments passed to functions. That is probably already too much to ask for :)

PS 2

Similar question (thanks nobar)

Misshapen answered 8/1, 2011 at 22:15 Comment(4)
Either find the author and pay him or reimplement it yourself.Lemures
I'm not sure whether using compiled GNU code in your commercial application is the same as modifying/customizing the GNU code itself to distribute inside your app. Anyone?Scone
Is it for Linux/x86 only, or should this code run on different platforms?Buskined
No line number requirement: https://mcmap.net/q/64721/-how-to-print-a-stack-trace-whenever-a-certain-function-is-calledAcquit
S
37

Not too long ago I answered a similar question. You should take a look at the source code available on method #4, which also prints line numbers and filenames.

  • Method #4:

A small improvement I've done on method #3 to print line numbers. This could be copied to work on method #2 also.

Basically, it uses addr2line to convert addresses into file names and line numbers.

The source code below prints line numbers for all local functions. If a function from another library is called, you might see a couple of ??:0 instead of file names.

#include <stdio.h>
#include <stdlib.h>   /* exit, system */
#include <signal.h>
#include <execinfo.h>

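/* Note: taking struct sigcontext by value (and reading ctx.eip / ctx.cr2) is a
 * legacy 32-bit x86 Linux idiom; it is not portable to other targets. */
void bt_sighandler(int sig, struct sigcontext ctx) {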

  void *trace[16];
  char **messages = (char **)NULL;
  int i, trace_size = 0;

  if (sig == SIGSEGV)
    printf("Got signal %d, faulty address is %p, "
           "from %p\n", sig, ctx.cr2, ctx.eip);
  else
    printf("Got signal %d\n", sig);

  trace_size = backtrace(trace, 16);
  /* overwrite sigaction with caller's address */
  trace[1] = (void *)ctx.eip;
  messages = backtrace_symbols(trace, trace_size);
  /* skip first stack frame (points here) */
  printf("[bt] Execution path:\n");
  for (i=1; i<trace_size; ++i)
  {
    printf("[bt] #%d %s\n", i, messages[i]);

    /* Find the first occurrence of '(' or ' ' in messages[i] and assume
     * everything before it is the file name. Stop at the string
     * terminator so we never read past the end. */
    size_t p = 0;
    while(messages[i][p] != '(' && messages[i][p] != ' '
            && messages[i][p] != 0)
        ++p;

    /* Run: addr2line <address> -e <file name of the symbol> */
    char syscom[256];
    snprintf(syscom, sizeof(syscom), "addr2line %p -e %.*s",
             trace[i], (int)p, messages[i]);
    system(syscom);
  }

  exit(0);
}


int func_a(int a, char b) {

  char *p = (char *)0xdeadbeef;

  a = a + b;
  *p = 10;  /* CRASH here!! */

  return 2*a;
}


int func_b() {

  int res, a = 5;

  res = 5 + func_a(a, 't');

  return res;
}


int main() {

  /* Install our signal handler */
  struct sigaction sa;

  sa.sa_handler = (void *)bt_sighandler;
  sigemptyset(&sa.sa_mask);
  sa.sa_flags = SA_RESTART;

  sigaction(SIGSEGV, &sa, NULL);
  sigaction(SIGUSR1, &sa, NULL);
  /* ... add any other signal here */

  /* Do something */
  printf("%d\n", func_b());
}

This code should be compiled as: gcc -g sighandler.c -o sighandler -rdynamic (-g is what lets addr2line resolve file names and line numbers)

The program outputs:

Got signal 11, faulty address is 0xdeadbeef, from 0x8048975
[bt] Execution path:
[bt] #1 ./sighandler(func_a+0x1d) [0x8048975]
/home/karl/workspace/stacktrace/sighandler.c:44
[bt] #2 ./sighandler(func_b+0x20) [0x804899f]
/home/karl/workspace/stacktrace/sighandler.c:54
[bt] #3 ./sighandler(main+0x6c) [0x8048a16]
/home/karl/workspace/stacktrace/sighandler.c:74
[bt] #4 /lib/tls/i686/cmov/libc.so.6(__libc_start_main+0xe6) [0x3fdbd6]
??:0
[bt] #5 ./sighandler() [0x8048781]
??:0
Scone answered 12/1, 2011 at 22:5 Comment(12)
Remember to compile your application with -rdynamic.Scone
@karlphillip, about the GPL: if GPL-licensed code (not GNU, but GPL-licensed) is linked (with ld.so or ld) into other code, the GPL requires that the other code be available under the GPL too. This only applies when the application is transferred to other people. Personally you can do anything with GPL code and link it with anything.Buskined
@Buskined And what happens if your app uses dynamically linked GPL libraries available on the system target. Same rule?Scone
For the many dynamic libraries there is a LGPL license, which allows linking.Buskined
I accept the answer, since it is the closest to what I want.Misshapen
@dimba: Don't forget to award him the bounty as well (just underneath the accept tick!!)Hudspeth
@Scone I tried your solution and looked into the linux journal article, however I need to do what is being done in that program without a crash in the program. Basically I am trying to implement a custom exception class which will print the backtrace and line numbers, etc when an exception is caught. So, I do not have a SIGSEGV coming into the picture. Any thoughts on how that might be achieved ?Vertievertiginous
error: ‘struct sigcontext’ has no member named ‘eip’; did you mean ‘rip’?Zucker
@Zucker I fixed the error as suggested by replacing 'eip' with 'rip'.Survivor
I don't think this answer is valid anymore. I've compiled the code (applying eip -> rip fix) on Ubuntu 18 and didn't get a single line correct: Got signal 11, faulty address is 0x10202, from (nil) [bt] Execution path: [bt] #1 [(nil)] sh: 1: Syntax error: word unexpected (expecting ")") [bt] #2 ./sighandler(func_a+0x20) [0x55b0d4f96dad] ??:0 [bt] #3 ./sighandler(func_b+0x1e) [0x55b0d4f96dd5] ??:0 [bt] #4 ./sighandler(main+0x7e) [0x55b0d4f96e5e] ??:0 Duckweed
Run it inside GDB and identify the offending line.Scone
This isn't going to work with aslr.Hendrick
S
48

So you want a stand-alone function that prints a stack trace with all of the features that gdb stack traces have and that doesn't terminate your application. The answer is to automate the launch of gdb in a non-interactive mode to perform just the tasks that you want.

This is done by executing gdb in a child process, using fork(), and scripting it to display a stack-trace while your application waits for it to complete. This can be performed without the use of a core-dump and without aborting the application. I learned how to do this from looking at this question: How it's better to invoke gdb from program to print it's stacktrace?

The example posted with that question didn't work for me exactly as written, so here's my "fixed" version (I ran this on Ubuntu 9.04).

#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>
#include <sys/prctl.h>

void print_trace() {
    char pid_buf[30];
    sprintf(pid_buf, "%d", getpid());
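    char name_buf[512];
    ssize_t len = readlink("/proc/self/exe", name_buf, sizeof(name_buf) - 1);
    if (len < 0) return; /* could not resolve the executable path */
    name_buf[len] = 0;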
    prctl(PR_SET_PTRACER, PR_SET_PTRACER_ANY, 0, 0, 0);
    int child_pid = fork();
    if (!child_pid) {
        dup2(2,1); // redirect output to stderr - edit: unnecessary?
        execl("/usr/bin/gdb", "gdb", "--batch", "-n", "-ex", "thread", "-ex", "bt", name_buf, pid_buf, NULL);
        abort(); /* If gdb failed to start */
    } else {
        waitpid(child_pid,NULL,0);
    }
}

As shown in the referenced question, gdb provides additional options that you could use. For example, using "bt full" instead of "bt" produces an even more detailed report (local variables are included in the output). The manpages for gdb are kind of light, but complete documentation is available here.

Since this is based on gdb, the output includes demangled names, line-numbers, function arguments, and optionally even local variables. Also, gdb is thread-aware, so you should be able to extract some thread-specific metadata.

Here's an example of the kind of stack traces that I see with this method.

0x00007f97e1fc2925 in waitpid () from /lib/libc.so.6
[Current thread is 0 (process 15573)]
#0  0x00007f97e1fc2925 in waitpid () from /lib/libc.so.6
#1  0x0000000000400bd5 in print_trace () at ./demo3b.cpp:496
#2  0x0000000000400c09 in recursive (i=2) at ./demo3b.cpp:636
#3  0x0000000000400c1a in recursive (i=1) at ./demo3b.cpp:646
#4  0x0000000000400c1a in recursive (i=0) at ./demo3b.cpp:646
#5  0x0000000000400c46 in main (argc=1, argv=0x7fffe3b2b5b8) at ./demo3b.cpp:70

Note: I found this to be incompatible with the use of valgrind (probably due to Valgrind's use of a virtual machine). It also doesn't work when you are running the program inside of a gdb session (can't apply a second instance of "ptrace" to a process).

Smokestack answered 19/1, 2011 at 5:42 Comment(16)
@nobar +1 Good! When it prints the line numbers it would be even better.Scone
It does print the line numbers for me. What makes you say that it doesn't?Smokestack
@nobar The fact that on my system, it doesn't! And I'm compiling with -rdynamic and -g. How are you compiling the test application? I'm using GDB 7.1, how about you?Scone
@karlphillip: I'm only using "-g" to compile. My gdb is version "6.8-debian". The current gdb documentation says that it will print line numbers in a back-trace: "The backtrace also shows the source file name and line number, as well as the arguments to the function." Does your test application work with a debugger (can you single-step through your source lines)?Smokestack
@nobar I apologize, I can see it now: #3 0x080489d5 in main () at stacktrace_test.cpp:29 You should add a reference to this answer at the other question, which hasn't been answered yet. Thank you.Scone
@karlphillip: Great! I edited my answer to include example output.Smokestack
DO NOT USE THIS! I used the above function verbatim in my program, and on Ubuntu 12.04 it completely crashes the X Server.Close
@BeniBela, what kind of program were you running? Was it something low-level? This approach works fine for me in Fedora 17.Depose
I realize this can be repurposed to give you an interactive debugger session to inspect the moribund process, by removing the "--batch". I wonder if there is a simple way to use gdb to make the process resume from where it left off, causing the original signal to be rethrown and caught by gdb.Depose
@Syncopated: No, a normal qt desktop application. Perhaps it is caused by some kind of input wrapper (like Dbus, or so??) which connects to the original application and the fork, and then blocks the inputClose
And it is getting worse: ptracing the parent is now no longer permitted. But perhaps there is a flag you can set with prctl?Close
@BeniBela: Thanks for the pointer. One possible workaround is to run with sudo.Smokestack
You can bypass it with #include <sys/prctl.h> prctl(PR_SET_PTRACER, PR_SET_PTRACER_ANY, 0, 0, 0); before fork().Westernize
execl is safer than execlp and works perfectly, tooMaynard
@GiovanniFunchal All, I've updated the answer such that it works again (without prctl() it would not), according to the comments. Thanks! I also removed the fprintf() call because it wasn't outputting anything, and I'm not sure if the dup2() call is needed/helpful either.Tubular
@PatrizioBertoni It does work; I had to change gdb to /usr/bin/gdb to get it to work, as seen in my edit. That was my understanding of what needed to be done from reading the docs, and it works.Tubular
S
11

There is a robust discussion of essentially the same question at: How to generate a stacktrace when my gcc C++ app crashes. Many suggestions are provided, including lots of discussion about how to generate stack traces at run-time.

My personal favorite answer from that thread was to enable core dumps which allows you to view the complete application state at the time of the crash (including function arguments, line numbers, and unmangled names). An additional benefit of this approach is that it not only works for asserts, but also for segmentation faults and unhandled exceptions.

Different Linux shells use different commands to enable core dumps, but you can do it from within your application code with something like this...

#include <sys/resource.h>
...
struct rlimit core_limit = { RLIM_INFINITY, RLIM_INFINITY };
assert( setrlimit( RLIMIT_CORE, &core_limit ) == 0 ); // enable core dumps for debug builds

After a crash, run your favorite debugger to examine the program state.

$ kdbg executable core

Here's some sample output...

[image: screenshot of the debugger output]

It is also possible to extract the stack trace from a core dump at the command line.

$ ( CMDFILE=$(mktemp); echo "bt" >${CMDFILE}; gdb 2>/dev/null --batch -x ${CMDFILE} temp.exe core )
Core was generated by `./temp.exe'.
Program terminated with signal 6, Aborted.
[New process 22857]
#0  0x00007f4189be5fb5 in raise () from /lib/libc.so.6
#0  0x00007f4189be5fb5 in raise () from /lib/libc.so.6
#1  0x00007f4189be7bc3 in abort () from /lib/libc.so.6
#2  0x00007f4189bdef09 in __assert_fail () from /lib/libc.so.6
#3  0x00000000004007e8 in recursive (i=5) at ./demo1.cpp:18
#4  0x00000000004007f3 in recursive (i=4) at ./demo1.cpp:19
#5  0x00000000004007f3 in recursive (i=3) at ./demo1.cpp:19
#6  0x00000000004007f3 in recursive (i=2) at ./demo1.cpp:19
#7  0x00000000004007f3 in recursive (i=1) at ./demo1.cpp:19
#8  0x00000000004007f3 in recursive (i=0) at ./demo1.cpp:19
#9  0x0000000000400849 in main (argc=1, argv=0x7fff2483bd98) at ./demo1.cpp:26
Smokestack answered 16/1, 2011 at 22:40 Comment(4)
gdb is for post-mortem analysis. I'm looking more for how to get the information from inside the code. Maybe I want to print a backtrace in cases other than SIGSEGV, for example to see where an unhandled C++ exception is thrown from.Misshapen
An unhandled exception WILL generate a core-dump that you can use to analyze the stack at the time of the throw -- so this answer works for that.Smokestack
On the other hand, if you want to generate a stack trace without terminating the program, I posted another answer that addresses that requirement.Smokestack
Not good if you have the binary built on an alien system (from open source code), you can't understand the core dump on your side. And you can't ask some user to run such commands.Manchukuo
A
6

Since the GPL-licensed code is intended to help you during development, you could simply not include it in the final product. The GPL restricts you from distributing GPL-licensed code linked with non-GPL-compatible code. As long as you only use the GPL code in-house, you should be fine.

Accusative answered 12/1, 2011 at 22:31 Comment(2)
Dynamically linked is probably (almost certainly) fine, because the source is still open. Asserts and profiling code that goes into your executable... @Keith is right, use it in house.Enrica
Dynamic linking, as far as I know, hasn't been tested in court. While the FSF argues that it is not allowed, others have different opinions. Wikipedia has a good discussion of the different views on this. en.wikipedia.org/wiki/…Accusative
S
6

Here's an alternative approach. A debug_assert() macro programmatically sets a conditional breakpoint. If you are running in a debugger, you will hit a breakpoint when the assert expression is false -- and you can analyze the live stack (the program doesn't terminate). If you are not running in a debugger, a failed debug_assert() causes the program to abort and you get a core dump from which you can analyze the stack (see my earlier answer).

The advantage of this approach, compared to normal asserts, is that you can continue running the program after the debug_assert is triggered (when running in a debugger). In other words, debug_assert() is slightly more flexible than assert().

   #include <iostream>
   #include <cassert>
   #include <sys/resource.h> 

// note: The assert expression should show up in
// stack trace as parameter to this function
void debug_breakpoint( char const * expression )
   {
   asm("int3"); // x86 specific
   }

#ifdef NDEBUG
   #define debug_assert( expression )
#else
// creates a conditional breakpoint
   #define debug_assert( expression ) \
      do { if ( !(expression) ) debug_breakpoint( #expression ); } while (0)
#endif

void recursive( int i=0 )
   {
   debug_assert( i < 5 );
   if ( i < 10 ) recursive(i+1);
   }

int main( int argc, char * argv[] )
   {
   rlimit core_limit = { RLIM_INFINITY, RLIM_INFINITY };
   setrlimit( RLIMIT_CORE, &core_limit ); // enable core dumps
   recursive();
   }

Note: Sometimes "conditional breakpoints" setup within debuggers can be slow. By establishing the breakpoint programmatically, the performance of this method should be equivalent to that of a normal assert().

Note: As written, this is specific to the Intel x86 architecture -- other processors may have different instructions for generating a breakpoint.

Smokestack answered 18/1, 2011 at 4:8 Comment(1)
I once used something similar, but with an empty function debug_breakpoint. When debugging, I simply entered "bre debug_breakpoint" at the gdb prompt - no assembler needed (compile debug_breakpoint in a separate compilation unit to avoid having the call optimized away).Crammer
T
5

Use the google glog library for it. It has new BSD licence.

It contains a GetStackTrace function in the stacktrace.h file.

EDIT

I found here http://blog.bigpixel.ro/2010/09/09/stack-unwinding-stack-trace-with-gcc/ that there is a utility called addr2line that translates program addresses into file names and line numbers.

http://linuxcommand.org/man_pages/addr2line1.html

Tragicomedy answered 8/1, 2011 at 22:21 Comment(2)
Indeed glog has stack trace (google-glog.googlecode.com/svn/trunk/doc/glog.html Failure Signal Handler section), but it has no code line information.Misshapen
google-glog is a thin wrapper over backtrace and backtrace_symbols. It won't give you filenames and line numbersAngadresma
T
5

A bit late, but you can use libbfd to fetch the file name and line number, like refdbg does in symsnarf.c. libbfd is used internally by addr2line and gdb.

Tirewoman answered 2/11, 2011 at 14:9 Comment(0)
J
5

here is my solution:

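#include <execinfo.h>
#include <limits.h>    // PATH_MAX
#include <stdio.h>     // popen, pclose, fgets
#include <stdlib.h>    // free
#include <unistd.h>    // readlink
#include <array>
#include <iostream>
#include <memory>
#include <regex>
#include <stdexcept>
#include <string>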

std::string getexepath() {
    char result[PATH_MAX];
    ssize_t count = readlink("/proc/self/exe", result, PATH_MAX);
    return std::string(result, (count > 0) ? count : 0);
}

std::string sh(std::string cmd) {
    std::array<char, 128> buffer;
    std::string result;
    std::shared_ptr<FILE> pipe(popen(cmd.c_str(), "r"), pclose);
    if (!pipe) throw std::runtime_error("popen() failed!");
    while (!feof(pipe.get())) {
        if (fgets(buffer.data(), 128, pipe.get()) != nullptr) {
            result += buffer.data();
        }
    }
    return result;
}


void print_backtrace(void) {
    void *bt[1024];
    int bt_size;
    char **bt_syms;
    int i;

    bt_size = backtrace(bt, 1024);
    bt_syms = backtrace_symbols(bt, bt_size);
    std::regex re("\\[(.+)\\]");
    auto exec_path = getexepath();
    for (i = 1; i < bt_size; i++) {
        std::string sym = bt_syms[i];
        std::smatch ms;
        if (std::regex_search(sym, ms, re)) {
            std::string addr = ms[1];
            std::string cmd = "addr2line -e " + exec_path + " -f -C " + addr;
            auto r = sh(cmd);
            std::regex re2("\\n$");
            auto r2 = std::regex_replace(r, re2, "");
            std::cout << r2 << std::endl;
        }
    }
    free(bt_syms);
}

void test_m() {
    print_backtrace();
}

int main() {
    test_m();
    return 0;
}

output:

/home/roroco/Dropbox/c/ro-c/cmake-build-debug/ex/test_backtrace_with_line_number
test_m()
/home/roroco/Dropbox/c/ro-c/ex/test_backtrace_with_line_number.cpp:57
main
/home/roroco/Dropbox/c/ro-c/ex/test_backtrace_with_line_number.cpp:61
??
??:0

"??" and "??:0" since this trace is in libc, not in my source

Joann answered 17/8, 2018 at 7:24 Comment(0)
B
2

One solution is to start gdb with a "bt" script in the failed-assert handler. Integrating such a gdb launch is not very easy, but it will give you the backtrace, the arguments, and demangled names (or you can pipe gdb's output through the c++filt program).

Neither program (gdb or c++filt) is linked into your application, so the GPL will not require you to open-source the complete application.

You can use the same approach (exec a GPL program) with backtrace-symbols. Just generate an ASCII list of the %eip values and the map of the executable (/proc/self/maps) and pass them to a separate binary.
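A rough sketch of the "generate an ASCII list" half, assuming glibc's backtrace() and a Linux /proc filesystem. The function name raw_trace_report() is made up for illustration; the separate symbolizing binary is not shown. It uses iostreams and allocates, so it suits an assert handler rather than a signal handler.

```cpp
#include <execinfo.h>
#include <fstream>
#include <sstream>
#include <string>

// Hypothetical helper: collects the raw return addresses of the current
// call stack plus the process memory map, as plain text that an external
// tool could symbolize later.
std::string raw_trace_report() {
    void* frames[64];
    int n = backtrace(frames, 64);   // fills frames[0..n-1]

    std::ostringstream out;
    out << "frames:\n";
    for (int i = 0; i < n; ++i)
        out << frames[i] << '\n';    // one address per line

    out << "maps:\n";
    std::ifstream maps("/proc/self/maps");
    out << maps.rdbuf();             // copy the map verbatim
    return out.str();
}
```

You would write the returned string to a pipe or a file and let the other process map each address to a module using the "maps" section.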

Buskined answered 12/1, 2011 at 22:12 Comment(0)
N
2

You can use DeathHandler - a small C++ class which does everything for you, reliably.

Nephogram answered 1/3, 2013 at 6:39 Comment(1)
This is the same as exec'ing addr2line manually.Dickie
S
2

I had to do this in a production environment with many constraints, so I wanted to explain the advantages and disadvantages of the already posted methods.

  1. attach GDB

+ very simple and robust

- Slow for large programs because GDB insists on loading the entire address to line # database upfront instead of lazily

- Interferes with signal handling. When GDB is attached, it intercepts signals like SIGINT (ctrl-c), which will cause the program to get stuck at the GDB interactive prompt if some other process routinely sends such signals. Maybe there's some way around it, but this made GDB unusable in my case. You can still use it if you only care about printing a call stack once when your program crashes, but not multiple times.

  2. addr2line. Here's an alternate solution that doesn't use backtrace_symbols.

+ Doesn't allocate from the heap, which is unsafe inside a signal handler

+ Don't need to parse output of backtrace_symbols

- Won't work on MacOS, which doesn't have dladdr1. You can use _dyld_get_image_vmaddr_slide instead, which returns the same offset as link_map::l_addr.

- Requires adding negative offset or else the translated line # will be 1 greater. backtrace_symbols does this for you

#include <execinfo.h>
#include <link.h>
#include <stdlib.h>
#include <stdio.h>

// converts a function's address in memory to its VMA address in the executable file. VMA is what addr2line expects
size_t ConvertToVMA(size_t addr)
{
  Dl_info info;
  link_map* link_map;
  dladdr1((void*)addr,&info,(void**)&link_map,RTLD_DL_LINKMAP);
  return addr-link_map->l_addr;
}

void PrintCallStack()
{
  void *callstack[128];
  int frame_count = backtrace(callstack, sizeof(callstack)/sizeof(callstack[0]));
  for (int i = 0; i < frame_count; i++)
  {
    char location[1024];
    Dl_info info;
    if(dladdr(callstack[i],&info))
    {
      char command[256];
      size_t VMA_addr=ConvertToVMA((size_t)callstack[i]);
      //if(i!=crash_depth)
        VMA_addr-=1;    // https://mcmap.net/q/66716/-wrong-line-numbers-from-addr2line/63841497#63841497
      snprintf(command,sizeof(command),"addr2line -e %s -Ci %zx",info.dli_fname,VMA_addr);
      system(command);
    }
  }
}

void Foo()
{
  PrintCallStack();
}

int main()
{
  Foo();
  return 0;
}

I also want to clarify what addresses backtrace and backtrace_symbols generate and what addr2line expects. [image: diagram of these addresses] addr2line expects Foo_VMA or, if you're using --section=.text, Foo_file - text_file. backtrace returns Foo_mem. backtrace_symbols generates Foo_VMA somewhere. One big mistake I made, and saw in several other posts, was assuming VMA_base = 0, or Foo_VMA = Foo_file = Foo_mem - ELF_mem, which is easy to calculate. That often works, but some compilers (i.e. linker scripts) use VMA_base > 0. Examples are GCC 5.4 on Ubuntu 16 (0x400000) and clang 11 on MacOS (0x100000000). For shared libs, it's always 0. It seems VMA_base is only meaningful for non-position-independent code; otherwise it has no effect on where the EXE is loaded in memory.

Also, neither karlphillip's answer nor this one requires compiling with -rdynamic. That flag increases the binary size, especially for a large C++ program or shared lib, with useless entries in the dynamic symbol table that never get imported.
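As an illustration of that address arithmetic, here is a sketch of an alternative to dladdr1 that uses glibc's dl_iterate_phdr to find the module containing an address and subtract its load bias (dlpi_addr, which I believe is the same value as link_map::l_addr). The names Lookup, visit and to_vma are mine. Linux/glibc only, so it does not help on MacOS.

```cpp
#include <link.h>
#include <cstddef>
#include <cstdint>

namespace {
struct Lookup {
    uintptr_t addr;   // runtime address to look up (Foo_mem)
    uintptr_t bias;   // load bias of the module that contains it
    bool found;
};

// Called once per loaded module; checks each PT_LOAD segment.
int visit(dl_phdr_info* info, size_t, void* data) {
    auto* q = static_cast<Lookup*>(data);
    for (int i = 0; i < info->dlpi_phnum; ++i) {
        const ElfW(Phdr)& ph = info->dlpi_phdr[i];
        if (ph.p_type != PT_LOAD) continue;
        uintptr_t start = info->dlpi_addr + ph.p_vaddr;
        if (q->addr >= start && q->addr < start + ph.p_memsz) {
            q->bias = info->dlpi_addr;
            q->found = true;
            return 1;  // a nonzero return stops the iteration
        }
    }
    return 0;
}
}  // namespace

// Returns the address as addr2line expects it (Foo_VMA), or 0 if the
// address does not fall inside any loaded module.
uintptr_t to_vma(const void* mem_addr) {
    Lookup q{reinterpret_cast<uintptr_t>(mem_addr), 0, false};
    dl_iterate_phdr(visit, &q);
    return q.found ? q.addr - q.bias : 0;
}
```

For a non-PIE executable the bias is 0 and the address comes back unchanged; for PIE executables and shared libs the result is the offset that addr2line -e <module> understands.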

Systematology answered 11/9, 2020 at 22:56 Comment(2)
Could you clarify how I could make this work on mac using _dyld_get_image_vmaddr_slide?Pearsall
@Pearsall Sure, I already said _dyld_get_image_vmaddr_slide returns the difference between the module's actual base address in memory and VMAbase, or Elf_mem - VMA_base = Foo_mem - Foo_VMA in the picture. So in ConvertToVMA, just return addr - _dyld_get_image_vmaddr_slide (module_index). I don't know of an easy way to find which module an address falls into, so I ended up assuming it's always 0, the main EXE.Systematology
H
1

I suppose line numbers are related to current eip value, right?

SOLUTION 1:
Then you can use something like GetThreadContext(), except that you're working on linux. I googled around a bit and found something similar, ptrace():

The ptrace() system call provides a means by which a parent process may observe and control the execution of another process, and examine and change its core image and registers. [...] The parent can initiate a trace by calling fork(2) and having the resulting child do a PTRACE_TRACEME, followed (typically) by an exec(3). Alternatively, the parent may commence trace of an existing process using PTRACE_ATTACH.

Now I was thinking, you can write a 'main' program which checks for signals that are sent to its child, the real program you're working on. After fork() it calls waitid():

All of these system calls are used to wait for state changes in a child of the calling process, and obtain information about the child whose state has changed.

and if a SIGSEGV (or something similar) is caught call ptrace() to obtain eip's value.

PS: I've never used these system calls (well, actually, I've never seen them before ;) so I don't know whether it's possible or whether it can help you. At least I hope these links are useful. ;)

SOLUTION 2: The first solution is quite complicated, right? I came up with a much simpler one: using signal() catch the signals you are interested in and call a simple function that reads the eip value stored in the stack:

...
signal(SIGSEGV, sig_handler);
...

void sig_handler(int signum)
{
    int eip_value;

    asm {
        push eax;
        mov eax, [ebp - 4]
        mov eip_value, eax
        pop eax
    }

    // now you have the address of the
    // **next** instruction after the
    // SIGSEGV was received
}

That asm syntax is Borland's one, just adapt it to GAS. ;)
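If you'd rather avoid inline assembly, here is a sketch of another way to get at the saved program counter: install the handler with SA_SIGINFO and read it from the ucontext_t the kernel passes in. This is my own variation, not the answer's method. The globals and function names are invented for the example, and the register names assume Linux with glibc on x86-64, i386 or AArch64.

```cpp
#include <signal.h>
#include <ucontext.h>
#include <cstdint>

// Written by the handler; volatile because they change asynchronously.
volatile uintptr_t g_fault_pc = 0;    // program counter saved by the kernel
volatile uintptr_t g_fault_addr = 0;  // si_addr (meaningful for SIGSEGV/SIGBUS)

extern "C" void on_signal(int, siginfo_t* info, void* ctx) {
    auto* uc = static_cast<ucontext_t*>(ctx);
#if defined(__x86_64__)
    g_fault_pc = static_cast<uintptr_t>(uc->uc_mcontext.gregs[REG_RIP]);
#elif defined(__i386__)
    g_fault_pc = static_cast<uintptr_t>(uc->uc_mcontext.gregs[REG_EIP]);
#elif defined(__aarch64__)
    g_fault_pc = static_cast<uintptr_t>(uc->uc_mcontext.pc);
#else
    (void)uc;  // other targets: register layout not covered by this sketch
#endif
    g_fault_addr = reinterpret_cast<uintptr_t>(info->si_addr);
}

// Installs the handler for `signum`; returns true on success.
bool install_pc_handler(int signum) {
    struct sigaction sa = {};
    sa.sa_sigaction = on_signal;
    sa.sa_flags = SA_SIGINFO;  // ask for the three-argument handler form
    sigemptyset(&sa.sa_mask);
    return sigaction(signum, &sa, nullptr) == 0;
}
```

Call install_pc_handler(SIGSEGV) early in main(); the stored value can then be fed to addr2line. As others point out, this only covers the top frame, not the whole call stack.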

Haleakala answered 12/1, 2011 at 22:20 Comment(9)
There are zero solutions in your "post". This does not look like an answer.Buskined
@osgx: can you please explain why? These solutions can give you the position where a error happened.Haleakala
It is not a complete solution. Obviously ptrace is the way to get information from a binary externally, and obviously you can take the eip from the stack. But you give no answer on how to get the entire backtrace and convert it to function names and source file line info.Buskined
@osgx: you have to convert absolute address (eip) into relative address (offset from every function). The smallest offset gives the function where the error happened.Haleakala
@osgx: I'm curious to hear how you get from "this is not a complete solution" to the statement that "there is zero solution in your post". Partial solutions are both acceptable and valuable on SO.Otero
@BlackBear, this works only for the top function, not for the complete call stack. Also, this functionality is already in glibc (backtrace).Buskined
@jalf, the problem in the question is to get a backtrace and convert it to function names and source file lines. This answer is very partial: it only tries to get the current function at runtime, or it suggests we must use an external process and ptrace.Buskined
@osgx: wouldn't it be possible to keep a log of called functions then?Haleakala
@Haleakala - there is such a log already. Every function must know the address to return to after it runs. This log is called the function call stack. yosefk.com/blog/… is one example of how it can be read from the stack.Buskined
S
1

Here's my third answer -- still trying to take advantage of core dumps.

It wasn't completely clear in the question whether the "assert-like" macros were supposed to terminate the application (the way assert does) or they were supposed to continue executing after generating their stack-trace.

In this answer, I'm addressing the case where you want to show a stack-trace and continue executing. I wrote the coredump() function below to generate a core dump, automatically extract the stack-trace from it, then continue executing the program.

Usage is the same as that of assert(). The difference, of course, is that assert() terminates the program but coredump_assert() does not.

   #include <iostream>
   #include <sys/resource.h> 
   #include <cstdio>
   #include <cstdlib>
   #include <boost/lexical_cast.hpp>
   #include <string>
   #include <sys/wait.h>
   #include <unistd.h>

   std::string exename;

// expression argument is for diagnostic purposes (shows up in call-stack)
void coredump( char const * expression )
   {

   pid_t childpid = fork();

   if ( childpid == 0 ) // child process generates core dump
      {
      rlimit core_limit = { RLIM_INFINITY, RLIM_INFINITY };
      setrlimit( RLIMIT_CORE, &core_limit ); // enable core dumps
      abort(); // terminate child process and generate core dump
      }

// give each core-file a unique name
   if ( childpid > 0 ) waitpid( childpid, 0, 0 );
   static int count=0;
   using std::string;
   string pid = boost::lexical_cast<string>(getpid());
   string newcorename = "core-"+boost::lexical_cast<string>(count++)+"."+pid;
   string rawcorename = "core."+boost::lexical_cast<string>(childpid);
   int rename_rval = rename(rawcorename.c_str(),newcorename.c_str()); // try with core.PID
   if ( rename_rval == -1 ) rename_rval = rename("core",newcorename.c_str()); // try with just core
   if ( rename_rval == -1 ) std::cerr<<"failed to capture core file\n";

  #if 1 // optional: dump stack trace and delete core file
   string cmd = "( CMDFILE=$(mktemp); echo 'bt' >${CMDFILE}; gdb 2>/dev/null --batch -x ${CMDFILE} "+exename+" "+newcorename+" ; unlink ${CMDFILE} )";
   int system_rval = system( ("bash -c '"+cmd+"'").c_str() );
   if ( system_rval == -1 ) std::cerr.flush(), perror("system() failed during stack trace"), fflush(stderr);
   unlink( newcorename.c_str() );
  #endif

   }

#ifdef NDEBUG
   #define coredump_assert( expression ) ((void)(expression))
#else
   #define coredump_assert( expression ) do { if ( !(expression) ) { coredump( #expression ); } } while (0)
#endif

void recursive( int i=0 )
   {
   coredump_assert( i < 2 );
   if ( i < 4 ) recursive(i+1);
   }

int main( int argc, char * argv[] )
   {
   exename = argv[0]; // this is used to generate the stack trace
   recursive();
   }

When I run the program, it displays three stack traces...

Core was generated by `./temp.exe'.                                         
Program terminated with signal 6, Aborted.
[New process 24251]
#0  0x00007f2818ac9fb5 in raise () from /lib/libc.so.6
#0  0x00007f2818ac9fb5 in raise () from /lib/libc.so.6
#1  0x00007f2818acbbc3 in abort () from /lib/libc.so.6
#2  0x0000000000401a0e in coredump (expression=0x403303 "i < 2") at ./demo3.cpp:29
#3  0x0000000000401f5f in recursive (i=2) at ./demo3.cpp:60
#4  0x0000000000401f70 in recursive (i=1) at ./demo3.cpp:61
#5  0x0000000000401f70 in recursive (i=0) at ./demo3.cpp:61
#6  0x0000000000401f8b in main (argc=1, argv=0x7fffc229eb98) at ./demo3.cpp:66
Core was generated by `./temp.exe'.
Program terminated with signal 6, Aborted.
[New process 24259]
#0  0x00007f2818ac9fb5 in raise () from /lib/libc.so.6
#0  0x00007f2818ac9fb5 in raise () from /lib/libc.so.6
#1  0x00007f2818acbbc3 in abort () from /lib/libc.so.6
#2  0x0000000000401a0e in coredump (expression=0x403303 "i < 2") at ./demo3.cpp:29
#3  0x0000000000401f5f in recursive (i=3) at ./demo3.cpp:60
#4  0x0000000000401f70 in recursive (i=2) at ./demo3.cpp:61
#5  0x0000000000401f70 in recursive (i=1) at ./demo3.cpp:61
#6  0x0000000000401f70 in recursive (i=0) at ./demo3.cpp:61
#7  0x0000000000401f8b in main (argc=1, argv=0x7fffc229eb98) at ./demo3.cpp:66
Core was generated by `./temp.exe'.
Program terminated with signal 6, Aborted.
[New process 24267]
#0  0x00007f2818ac9fb5 in raise () from /lib/libc.so.6
#0  0x00007f2818ac9fb5 in raise () from /lib/libc.so.6
#1  0x00007f2818acbbc3 in abort () from /lib/libc.so.6
#2  0x0000000000401a0e in coredump (expression=0x403303 "i < 2") at ./demo3.cpp:29
#3  0x0000000000401f5f in recursive (i=4) at ./demo3.cpp:60
#4  0x0000000000401f70 in recursive (i=3) at ./demo3.cpp:61
#5  0x0000000000401f70 in recursive (i=2) at ./demo3.cpp:61
#6  0x0000000000401f70 in recursive (i=1) at ./demo3.cpp:61
#7  0x0000000000401f70 in recursive (i=0) at ./demo3.cpp:61
#8  0x0000000000401f8b in main (argc=1, argv=0x7fffc229eb98) at ./demo3.cpp:66
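
The rename() calls above look for core or core.PID in the current directory. Whether the kernel writes the file there depends on /proc/sys/kernel/core_pattern. On many distributions the pattern starts with '|', which pipes the dump to a helper program, and then no core file appears next to the program. A small check, as a sketch; the function names are hypothetical helpers, not part of any library:

```cpp
#include <fstream>
#include <string>

// True when the pattern pipes core dumps to a helper program.
bool corePatternIsPiped(const std::string& pattern) {
    return !pattern.empty() && pattern[0] == '|';
}

// True when the pattern names a plain file directly in the current directory.
// Conservative: an empty (unreadable) pattern counts as "no".
bool coreLandsInCwd(const std::string& pattern) {
    return !pattern.empty() && !corePatternIsPiped(pattern) &&
           pattern.find('/') == std::string::npos;
}

// Reads the active pattern; returns an empty string if it cannot be read.
std::string readCorePattern() {
    std::ifstream in("/proc/sys/kernel/core_pattern");
    std::string pattern;
    std::getline(in, pattern);
    return pattern;
}
```

Calling coreLandsInCwd(readCorePattern()) before coredump() lets the program report a clear message instead of "failed to capture core file". Note that a pattern with % specifiers can still produce a name other than core or core.PID.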
Smokestack answered 18/1, 2011 at 8:49 Comment(3)
When calling system() from the application, aren't you afraid of side effects? In my case I'm talking about a multi-threaded application that consumes a reasonable amount of resident memory.Misshapen
There is a problem here for multi-threaded programs: There is a race condition when it is writing and renaming the core file -- two different threads might use the same core file name at the same time. You could add a mutex to address this.Smokestack
@dimba: Could you be more specific about the kinds of side effects that might occur when calling system()? I'm not aware of any such problems.Smokestack
S
0

AFAICS, none of the solutions provided so far print function names and line numbers from shared libraries. That's what I needed, so I altered karlphillip's solution (and some other answer from a similar question) to resolve shared library addresses using /proc/self/maps.

#include <stdlib.h>
#include <inttypes.h>
#include <stdio.h>
#include <string.h>
#include <execinfo.h>
#include <stdbool.h>

struct Region { // one mapped file, for example a shared library
    uintptr_t start;
    uintptr_t end;
    char* path;
};

static struct Region* getRegions(int* size) {
// parse /proc/self/maps and get list of mapped files;
// consecutive mappings of the same file are merged into one region
    FILE* file;
    int allocated = 10;
    *size = 0;
    struct Region* res;
    uintptr_t regionStart = 0x00000000;
    uintptr_t regionEnd = 0x00000000;
    char* regionPath = NULL;
    bool haveRegion = false; // false until the first line has been read
    uintmax_t matchedStart;
    uintmax_t matchedEnd;
    char* matchedPath;

    file = fopen("/proc/self/maps", "r");
    if (file == NULL)
        return NULL; // *size is 0, so callers see an empty list
    res = (struct Region*)malloc(sizeof(struct Region) * allocated);
    for (;;) {
        // %m leaves matchedPath untouched on lines without a path, so reset it
        matchedPath = NULL;
        int matched = fscanf(file, "%jx-%jx %*s %*s %*s %*s%*[ ]%m[^\n]\n",  &matchedStart, &matchedEnd, &matchedPath);
        bool done = matched < 2; // EOF or an unparsable line
        bool bothNull = !done && matchedPath == NULL && regionPath == NULL;
        bool similar = !done && matchedPath && regionPath && !strcmp(matchedPath, regionPath);
        if(haveRegion && (bothNull || similar)) {
            free(matchedPath);
            regionEnd = matchedEnd;
        } else {
            if(haveRegion) { // store the region that just ended
                if(*size == allocated) {
                    allocated *= 2;
                    res = (struct Region*)realloc(res, sizeof(struct Region) * allocated);
                }
                res[*size].start = regionStart;
                res[*size].end = regionEnd;
                res[*size].path = regionPath;
                (*size)++;
            }
            if(done)
                break;
            regionStart = matchedStart;
            regionEnd = matchedEnd;
            regionPath = matchedPath;
            haveRegion = true;
        }
    }
    fclose(file);
    return res;
}

struct SemiResolvedAddress {
    char* path;
    uintptr_t offset;
};
static struct SemiResolvedAddress semiResolve(struct Region* regions, int regionsNum, uintptr_t address) {
// convert an address from our address space to
// an address suitable for addr2line
    struct Region* region;
    struct SemiResolvedAddress res = {"", address};
    for(region = regions; region < regions+regionsNum; region++) {
        if(address >= region->start && address < region->end) {
            res.path = region->path;
            res.offset = address - region->start;
        }
    }
    return res;
}

void printStacktraceWithLines(unsigned int max_frames)
{
    int regionsNum;
    fprintf(stderr, "stack trace:\n");

    // storage array for stack trace address data
    void* addrlist[max_frames+1];

    // retrieve current stack addresses
    int addrlen = backtrace(addrlist, sizeof(addrlist) / sizeof(void*));
    if (addrlen == 0) {
        fprintf(stderr, "  <empty, possibly corrupt>\n");
        return;
    }
    struct Region* regions = getRegions(&regionsNum); 
    for (int i = 1; i < addrlen; i++)
    {
        struct SemiResolvedAddress hres =
                semiResolve(regions, regionsNum, (uintptr_t)(addrlist[i]));
        char syscom[4200]; // large enough for a PATH_MAX (4096) path plus the command
        snprintf(syscom, sizeof(syscom), "addr2line -C -f -p -a -e %s 0x%jx", hres.path ? hres.path : "", (uintmax_t)(hres.offset));
        system(syscom);
    }
    free(regions);
}
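
The offset passed to addr2line above is the address minus the start of the file's first mapping, which matches the file's own addresses only when the object is linked at base 0. A possible alternative that avoids parsing /proc/self/maps is glibc's dl_iterate_phdr(), which reports each loaded object's load bias (dlpi_addr); subtracting it gives the link-time address whatever the base is. A sketch in C++, with ModuleAddress, Lookup, visit and resolveModule as illustrative names:

```cpp
#include <link.h>
#include <cstddef>
#include <cstdint>
#include <string>

struct ModuleAddress {
    std::string path;      // object file name; empty for the main executable
    std::uintptr_t vaddr;  // link-time address, the form addr2line expects
    bool found;            // false if no loaded object contains the address
};

struct Lookup {
    std::uintptr_t addr;
    ModuleAddress result;
};

// Callback for dl_iterate_phdr: checks whether addr falls inside one of the
// object's loadable segments.
static int visit(struct dl_phdr_info* info, std::size_t, void* data) {
    Lookup* lk = static_cast<Lookup*>(data);
    for (int i = 0; i < info->dlpi_phnum; ++i) {
        const ElfW(Phdr)& ph = info->dlpi_phdr[i];
        if (ph.p_type != PT_LOAD) continue;
        std::uintptr_t begin = info->dlpi_addr + ph.p_vaddr;
        std::uintptr_t end = begin + ph.p_memsz;
        if (lk->addr >= begin && lk->addr < end) {
            lk->result.path = info->dlpi_name ? info->dlpi_name : "";
            lk->result.vaddr = lk->addr - info->dlpi_addr;
            lk->result.found = true;
            return 1;  // non-zero stops the iteration
        }
    }
    return 0;
}

ModuleAddress resolveModule(const void* address) {
    Lookup lk{reinterpret_cast<std::uintptr_t>(address), {"", 0, false}};
    dl_iterate_phdr(visit, &lk);
    return lk.result;
}
```

For each frame returned by backtrace(), resolveModule(addrlist[i]) would supply the -e argument and the address for addr2line. glibc reports an empty name for the main executable, so /proc/self/exe can stand in for it.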
Salazar answered 26/7, 2019 at 17:12 Comment(1)
This assumes the program addresses in the EXE count from 0 (VMAbase = 0), which isn't true for some linkers. See my post.Systematology
A
0

C++23 <stacktrace>

Finally, this has arrived! More details/comparison with other systems at: print call stack in C or C++

stacktrace.cpp

#include <iostream>
#include <stacktrace>

void my_func_2(void) {
    std::cout << std::stacktrace::current(); // Line 5
}

void my_func_1(double f) {
    (void)f;
    my_func_2(); // Line 10
}

void my_func_1(int i) {
    (void)i;
    my_func_2(); // Line 15
}

int main(int argc, char **argv) {
    my_func_1(1);   // Line 19
    my_func_1(2.0); // Line 20
}

GCC 12.1.0 from Ubuntu 22.04 does not have support compiled in, so for now I built it from source as per: How to edit and re-build the GCC libstdc++ C++ standard library source? and set --enable-libstdcxx-backtrace=yes, and it worked!

Compile and run:

g++ -ggdb3 -O2 -std=c++23 -Wall -Wextra -pedantic -o stacktrace.out stacktrace.cpp -lstdc++_libbacktrace
./stacktrace.out

Output:

   0# my_func_2() at /home/ciro/stacktrace.cpp:5
   1# my_func_1(int) at /home/ciro/stacktrace.cpp:15
   2#      at :0
   3#      at :0
   4#      at :0
   5# 
   0# my_func_2() at /home/ciro/stacktrace.cpp:5
   1# my_func_1(double) at /home/ciro/stacktrace.cpp:10
   2#      at :0
   3#      at :0
   4#      at :0
   5#

The trace is not perfect (missing main line) because of optimization I think. With -O0 it is better:

   0# my_func_2() at /home/ciro/stacktrace.cpp:5
   1# my_func_1(int) at /home/ciro/stacktrace.cpp:15
   2#      at /home/ciro/stacktrace.cpp:19
   3#      at :0
   4#      at :0
   5#      at :0
   6# 
   0# my_func_2() at /home/ciro/stacktrace.cpp:5
   1# my_func_1(double) at /home/ciro/stacktrace.cpp:10
   2#      at /home/ciro/stacktrace.cpp:20
   3#      at :0
   4#      at :0
   5#      at :0
   6# 

I don't know why the name main is missing, but the line is there.

The "extra" lines after main like:

   3#      at :0
   4#      at :0
   5#      at :0
   6# 

are probably stuff that runs before main and that ends up calling main: What happens before main in C++?
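
If the trace has to go somewhere other than a stream, or needs a custom format, the entries can be walked one by one. A sketch that uses std::stacktrace when the library advertises it through __cpp_lib_stacktrace, and otherwise falls back to glibc's backtrace_symbols() (no line numbers in that case); captureFrames is an illustrative name:

```cpp
#include <string>
#include <vector>
#if __has_include(<version>)
#include <version>  // defines __cpp_lib_stacktrace when the library supports it
#endif

#ifdef __cpp_lib_stacktrace
#include <stacktrace>
#else
#include <execinfo.h>
#include <cstdlib>
#endif

// Returns one formatted line per frame of the current call stack.
std::vector<std::string> captureFrames() {
    std::vector<std::string> lines;
#ifdef __cpp_lib_stacktrace
    for (const std::stacktrace_entry& entry : std::stacktrace::current()) {
        lines.push_back(entry.description() + " at " + entry.source_file() +
                        ":" + std::to_string(entry.source_line()));
    }
#else
    void* addrs[64];
    int count = backtrace(addrs, 64);
    char** symbols = backtrace_symbols(addrs, count);
    for (int i = 0; symbols != nullptr && i < count; ++i) {
        lines.push_back(symbols[i]);
    }
    std::free(symbols);  // backtrace_symbols returns one malloc'ed block
#endif
    return lines;
}
```

The std::stacktrace branch needs the same compiler and linker flags as the example above; the fallback branch builds with a plain g++ invocation.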

Acquit answered 23/11, 2022 at 16:35 Comment(0)
