How to get a callstack from oprofile output?
Asked Answered
T

1

7

I'm so confused. I don't know if oprofile can even provide a stack trace from a profiling report. I've been reviewing the oprofile manual and it only refers to stacktraces by saying that they can be logged, but it doesn't give an example for how to do so.

Here's my test.cpp

#include <iostream>                              
#include <unistd.h>                              
using namespace std;                             

void test(){                                     
    for (int x = 0; x < 100000; x++) cout << ".";
    sleep(1);                                    
    cout << endl;                                
};                                               

int main(int argv, char** argc){                 
    for (int x = 0; x < 120; x++) test();        
    return 0;                                    
}                                                

Here's the command I used to compile it:

g++ -g -Wall test.cpp -o test

And, here's my perf.sh script (running on RHEL 6.2 in a VM):

#!/bin/bash -x
sudo opcontrol --no-vmlinux                                                 
sudo opcontrol --reset                                                      
sudo opcontrol --start --separate=library,thread --image=$HOME/test
sudo opcontrol --callgraph=10                                               
sudo opcontrol --status                                                     
read -p "Press [Enter] key to stop profiling"                                                                       
sudo opcontrol --dump || exit 1                                             
sudo opreport --demangle=smart \                                            
              --merge=all \                                                 
              --symbols \                                                   
              --callgraph \                                                 
              --global-percent \                                            
              --output-file=perf.out                                        
sudo opcontrol --shutdown                                                   
sudo opcontrol --reset                                                      

Here's the report that I'm getting, at this time:

CPU: CPU with timer interrupt, speed 0 MHz (estimated)                            
Profiling through timer interrupt                                                 
samples  %        app name                 symbol name                            
-------------------------------------------------------------------------------   
14       43.7500  libstdc++.so.6.0.13      /usr/lib64/libstdc++.so.6.0.13         
  14       43.7500  libstdc++.so.6.0.13      /usr/lib64/libstdc++.so.6.0.13 [self]
-------------------------------------------------------------------------------   
11       34.3750  libc-2.12.so             fwrite                                 
  11       34.3750  libc-2.12.so             fwrite [self]                        
-------------------------------------------------------------------------------   
5        15.6250  libc-2.12.so             _IO_file_xsputn@@GLIBC_2.2.5           
  5        15.6250  libc-2.12.so             _IO_file_xsputn@@GLIBC_2.2.5 [self]  
-------------------------------------------------------------------------------   
2         6.2500  libc-2.12.so             __strlen_sse42                         
  2         6.2500  libc-2.12.so             __strlen_sse42 [self]                
-------------------------------------------------------------------------------   

And, my question: How can I get stack traces to show up in the profiling report?

Timmy answered 21/6, 2013 at 18:18 Comment(3)
Good question. That bit of doc is not very clear. It's apologizing for not showing call counts, which are irrelevant under sampling. You should see 100% broken down into basically two parts. One would be main:12 -> test:7 -> sleep -> <system routines> and the other would be main:12 -> test:8 -> cout::endl -> <system IO routines>. I suspect most of it would be in the sleep. I suspect very little would be in the cout << "." unless you were outputting to stderr. Anyway, that's what several stackshots in GDB would show you.Calefactory
Yeah. GDB is a bit more helpful than oprofile was, except that I need to use oprofile in a multithreaded app that's much larger than the test script that I wrote above. So, using GDB isn't going to really work for me. After talking with others, I feel like instrumenting using logging+timers might make more sense.Timmy
Well, here's what I would do: in the many-thread case, when I interrupt it, every thread is halted, so I get a bt on each thread. The ones that are doing nothing, like waiting for input, I ignore. The others are valuable. I'm assuming the goal is to find ways to make the code faster, as opposed to just getting measurements. Maybe that's not your goal. You can talk to others, but the technique is not as well known as it ought to be, so you know what you can expect to hear.Calefactory
A
1

(this is a bit late, but this might help someone else)

Because you're profiling in timer mode (which is the default behavior on some CPUs), the backtracing might be disabled in your kernel (which version appears to be 2.6.32, since you're on RHEL 6.2).

You can try to:

  1. use hardware counters
  2. have a look at the history of the kernel part of oprofile, if there is indeed a limitations in your kernel version, this may have been fixed
  3. update your kernel

I faced the same issue with the same kernel release, but since I'm on ARM, my quick-fix won't work (this is the patch to apply in this case).

Africah answered 15/1, 2014 at 16:2 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.