What causes failure to unwind in a DWARF perf call stack?
Asked Answered
D

1

9

In backtraces produced by perf record --call-graph dwarf and printed by perf script, I am consistently getting wrong addresses for maybe 5% of call stacks, i.e. unwinding fails. One example is

my_bin 770395 705462.825887:    3560360 cycles: 
        7f0398b9b7e2 __vsnprintf_internal+0x12 (/usr/lib/x86_64-linux-gnu/libc-2.32.so)
        7ffcdb6fbfdf [unknown] ([stack])

my_bin 770395 705462.827040:    3447195 cycles: 
        7f0398ba1624 __GI__IO_default_xsputn+0x104 (inlined)
        7ffcdb6fb7af [unknown] ([stack])

and it was produced from this code (compiled with g++ -O3 -g -fno-omit-frame-pointer my_bin.cpp -o my_bin):

#include <cstdio>
#include <iostream>

int __attribute__ ((noinline)) libc_string(int x) {
    char buf[64] = {0};
    // Some nonsense workload using libc
    int res = 0;
    for (int i = 0; i < x; ++i) {
        res += snprintf(buf, 64, "%d %d %d Can I bring my friends to tea?", (i%10), (i%3)*10, i+2);
        res = res % 128;
    }
    return res;
}

int main() {
    int result = libc_string(20000000);
    std::cout << result << "\n";
}

I'm pretty sure that my program should not have executable code in the stack, so these addresses seem wrong. It's not only one program, but most programs that I've profiled which have about 5% wrong call stacks. These call stacks mostly just have two stack frames, with the innermost one sometimes in libraries like Eigen (even though they generally have correct calls stacks) and sometimes in the C++ STL or libc. I've seen the unwinding end up in unknown, [stack], [heap], anon, //anon, libstdc++.so.6.0.28, or <my_bin>.

I've seen this on Ubuntu 18.04, 20.04 and 20.10.

This happens only with DWARF unwinding. How can this be fixed?

Donofrio answered 18/6, 2021 at 14:15 Comment(2)
Have you tried to break with gdb at one of those functions and see how the backtrace looks like?Escarpment
@JorgeBellon Yes, it always looks correct.Donofrio
M
4

Which other types of unwinding did you test?

In examples I disabled kernel ASLR feature with setarch x86_64 -R for more stable addresses and smaller perf.data files.

Usage of perf record option -e cycles:u command may help, as it will not include kernel samples.

I have reproduction of similar dwarf unwind issue for files generated with default perf record event ('-e cycles:u'; libc-2.31 from libc6-prof package was used) for __GI__IO_default_xsputn (inlined) function:

env LD_LIBRARY_PATH=/lib/libc6-prof/x86_64-linux-gnu setarch `uname -m` -R perf record --call-graph dwarf -o perf.data.dwarf.u -e cycles:u ./my_bin
perf script -i perf.data.dwarf.u |less

Incorrect sample:

my_bin 28100 760107.271010:     461418 cycles:u: 
            7ffff7c74f06 __GI__IO_default_xsputn+0x106 (inlined)
            7ffff7c59c6c __vfprintf_internal+0xf4c (/usr/lib/libc6-prof/x86_64-linux-gnu/libc-2.31.so)

Correct sample:

my_bin 28100 760107.257283:     267268 cycles:u: 
            7ffff7c74eff __GI__IO_default_xsputn+0xff (inlined)
            7ffff7c59c6c __vfprintf_internal+0xf4c (/usr/lib/libc6-prof/x86_64-linux-gnu/libc-2.31.so)
            7ffff7c6e9f6 __vsnprintf_internal+0xb6 (/usr/lib/libc6-prof/x86_64-linux-gnu/libc-2.31.so)
            7ffff7d14a2c ___snprintf_chk+0x9c (inlined)
            555555555314 libc_string+0xb4 (/home/user/so/my_bin)
            555555555314 libc_string+0xb4 (/home/user/so/my_bin)
            555555555111 main+0x11 (/home/user/so/my_bin)
            7ffff7c040fa __libc_start_main+0x10a (/usr/lib/libc6-prof/x86_64-linux-gnu/libc-2.31.so)
            55555555519d _start+0x2d (/home/user/so/my_bin)

For correct unwind I have many different offsets in __GI__IO_default_xsputn+ (the number after +):

perf script -i perf.data.dwarf.u ||grep vsnprintf_internal -B3 |grep _GI__IO_default_xsputn|cut -d + -f 2|sort | uniq -c
...
    208 0x0 (inlined)
     45 0x101 (inlined)
      2 0x105 (inlined)
     91 0x10 (inlined)
    294 0x110 (inlined)
      2 0x117 (inlined)
      2 0x11d (inlined)
    326 0x121 (inlined)

But I have no +0x106 address with correct unwind. And all incorrect unwinds have +0x106 address. Let's check with gdb (it is easier after ASLR disabling; +262 is +0x106):

env LD_LIBRARY_PATH=/lib/libc6-prof/x86_64-linux-gnu setarch `uname -m` -R  gdb -q ./my_bin
(gdb) start
(gdb) x/i 0x7ffff7c74f06
   0x7ffff7c74f06 <__GI__IO_default_xsputn+262>:    retq  
(gdb) disassemble __GI__IO_default_xsputn
Dump of assembler code for function __GI__IO_default_xsputn:
...
   0x00007ffff7c74f05 <+261>:   pop    %rbp
   0x00007ffff7c74f06 <+262>:   retq   

Unwind issue seems to be connected with inlined(?) functions sampled at retq instruction or after pop %rbp?

Maxma answered 3/7, 2021 at 11:14 Comment(7)
To debug the parsing you may check output of perf script -D, it will dump raw sample data including registers and copy of stack data of default 8192 byte size. Correct and incorrect unwind samples may have some differences.Maxma
Thanks, this works for me and is a big step towards understanding the problem. Ideally I wouldn't need to exclude kernel samples, and I noticed that strangely, cycles:k produces good call stacks as well (which are not visible in the combined cycles). So the mystery to me is why cycles != cycles:u + cycles:k.Donofrio
cycles:k should have no user-space stacks, as I understand. I have incorrect call stack parsing now even with cycles:u, they looks like cycles:u: / __GI__IO_default_xsputn+0x106 (inlined) / __vfprintf_internal(libc6-prof) / NOTHING, so the issue is in some other place. There are also perf sources easier to recompile (make perf-src-tar-gz from linux kernel source dir or download from mirrors.edge.kernel.org/pub/linux/kernel/tools/perf) and to debug dwarf parsing of stack copies.Maxma
@nnnmmm, I updated the answer and I still have unwind failures. Could you please check with gdb is there always retq instruction at IP sample for dwarf unwind failures or pop %rbp just before sampled IP? Recompilation of libc6 with saving of temporary asm files (-save-temps) for related object files may help to find the dwarf issues for instructions touching %rbp.Maxma
My current status is that I get incorrect samples only when I don't specify an event, or when I specify cycles:pp. With cycles or cycles:u I'm getting clean stacks, and with cycles:k something different entirely. So I'm not sure if we're seeing the same things here, since you have problems with :u and I don't.Donofrio
And no, the incorrect samples (with no event specified) do not have any special instruction at the IP – no retq or pop %rbp before.Donofrio
@nnnmmm, what is your cpu vendor and generation (or just exact model)? There can be some conflict between dwarf stack parser and too exact IP data (zero skid) from PEBS/IBS (when ':p' or ':pp' or ':ppp' modifiers used on event). Kernel saves stack and registers on sample event from PMU, but :p changes some registers in the sample to zero skid IP using PEBS/IBS.Maxma

© 2022 - 2024 — McMap. All rights reserved.