Callgrind: Profile a specific part of my code

I'm trying to profile a specific part of my code with Callgrind, leaving out the noise and computation that I don't care about. Here is an example of what I want to do:

for (int i=0; i<maxSample; ++i) {
    //Prepare data to be processed...
    //Method to be profiled with these data
    //Post operation on the data
}

My use case is a regression test: I want to make sure that the method in question is still fast enough (something like less than 10% extra instructions compared with the last implementation). This is why I'd like cleaner output from Callgrind. (I need the for loop so that enough data is processed to get a good estimate of the behavior of the method I want to profile.)

My first try was to change the code to:

for (int i=0; i<maxSample; ++i) {
    //Prepare data to be processed...
    CALLGRIND_START_INSTRUMENTATION;
    //Method to be profiled with these data
    CALLGRIND_STOP_INSTRUMENTATION;
    //Post operation on the data
}
CALLGRIND_DUMP_STATS;

Here I added the Callgrind macros to control the instrumentation. I also passed the --instr-atstart=no option to be sure that I profile only the part of the code I want.

Unfortunately, with this configuration, my executable never finishes when I run it under Callgrind. It is not a question of ordinary slowness, because a run with full instrumentation lasts less than one minute.

I also tried

for (int i=0; i<maxSample; ++i) {
    //Prepare data to be processed...
    CALLGRIND_TOGGLE_COLLECT;
    //Method to be profiled with these data
    CALLGRIND_TOGGLE_COLLECT;
    //Post operation on the data
}
CALLGRIND_DUMP_STATS;

(or the --toggle-collect="myMethod" option), but Callgrind gave me a log without any calls: KCachegrind shows an empty profile and reports zero instructions.

Did I use the macros/options correctly? Any idea of what I need to change in order to get the expected result?

Suppurate answered 3/12, 2012 at 17:5 Comment(0)

I finally managed to solve this. It was a configuration issue.

I kept this code:

for (int i=0; i<maxSample; ++i) {
    //Prepare data to be processed...
    CALLGRIND_TOGGLE_COLLECT;
    //Method to be profiled with these data
    CALLGRIND_TOGGLE_COLLECT;
    //Post operation on the data
}
CALLGRIND_DUMP_STATS;

But I ran Callgrind with --collect-atstart=no (and without --instr-atstart=no) and it worked perfectly, in a reasonable time (about 1 min).
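A minimal, self-contained sketch of this setup might look like the following. The names `prepare`, `process`, `post_process` and `run_benchmark` are hypothetical placeholders for the real code, and the `__has_include` fallback is an assumption added so that the file still builds on machines without the Valgrind headers.

```cpp
#include <numeric>
#include <vector>

// Use the real Callgrind client requests when the Valgrind headers are
// available; otherwise fall back to no-ops (assumption, for portability).
#if defined(__has_include)
#  if __has_include(<valgrind/callgrind.h>)
#    include <valgrind/callgrind.h>
#  endif
#endif
#ifndef CALLGRIND_TOGGLE_COLLECT
#  define CALLGRIND_TOGGLE_COLLECT ((void)0)
#endif
#ifndef CALLGRIND_DUMP_STATS
#  define CALLGRIND_DUMP_STATS ((void)0)
#endif

// Hypothetical setup step, outside the measured region.
std::vector<int> prepare(int i) {
    std::vector<int> data(64);
    std::iota(data.begin(), data.end(), i);  // fills with i, i+1, ..., i+63
    return data;
}

// Hypothetical method under test: the only call between the two toggles.
long process(const std::vector<int>& data) {
    return std::accumulate(data.begin(), data.end(), 0L);
}

// Hypothetical post-processing step, outside the measured region.
long post_process(long total, long result) { return total + result; }

long run_benchmark(int maxSample) {
    long total = 0;
    for (int i = 0; i < maxSample; ++i) {
        std::vector<int> data = prepare(i);
        CALLGRIND_TOGGLE_COLLECT;  // collection on (it starts off with --collect-atstart=no)
        long result = process(data);
        CALLGRIND_TOGGLE_COLLECT;  // collection off again
        total = post_process(total, result);
    }
    CALLGRIND_DUMP_STATS;  // write the counters collected so far to a dump file
    return total;
}
```

Called from `main` and run as `valgrind --tool=callgrind --collect-atstart=no ./benchmark` (the binary name is hypothetical), only the work done between the two toggles should be attributed in the resulting profile.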

The issue with START/STOP instrumentation was that Callgrind dumped a file (callgrind.out.#number) at each iteration (each STOP), so it was extremely slow: after 5 min I had completed only 5,000 of the 300,000 iterations in the benchmark, which is unsuitable for a regression test.
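For the regression test itself, the comparison against the previous implementation could be sketched as below. This assumes the instruction counts (Ir) have already been extracted from the two Callgrind runs, for instance from the totals printed by `callgrind_annotate`. `within_budget` and its parameters are hypothetical names, and the 10% tolerance is the figure mentioned in the question.

```cpp
#include <cstdint>

// Returns true when `current` exceeds `baseline` by no more than `tolerance`
// (0.10 means that up to 10% extra instructions are accepted).
bool within_budget(std::uint64_t baseline, std::uint64_t current,
                   double tolerance = 0.10) {
    if (current <= baseline) {
        return true;  // same cost or cheaper always passes
    }
    const double extra = static_cast<double>(current - baseline);
    return extra <= tolerance * static_cast<double>(baseline);
}
```

Instruction counts are a better fit for this kind of check than wall-clock time because they vary far less from run to run.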

Suppurate answered 4/12, 2012 at 10:27 Comment(4)
So you don't start/stop the instrumentation? - Tachygraphy
@Tachygraphy start/stop will create a new dump every time you call stop. If you want the results aggregated in one single report, toggle seems to be the way to go (basically it just flips collection between active and inactive). - Suppurate
Thanks for the response @joetde. I asked because I am facing a problem where the callbacks/macros from Callgrind or any other tool never seem to get called. More here! - Tachygraphy
@Suppurate nice Q&A! As a corollary, does Callgrind slow down execution for the parts of the code you're not measuring? - Waki

The --toggle-collect option is very picky about how you specify the method to use as the trigger. You need to specify its argument list as well, and even the whitespace needs to match. Use the method name exactly as it appears in the Callgrind output. For instance, I am using this invocation:

$ valgrind \
    --tool=callgrind \
    --collect-atstart=no \
    "--toggle-collect=ctrl_simulate(float, int)" \
    ./swaag

Please observe:

  • The double quotes around the option.
  • The argument list including parentheses.
  • The whitespace after the comma character.
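
To make the matching concrete, here is a hedged sketch of a C++ function whose name, as a demangler prints it, is exactly `ctrl_simulate(float, int)`. The body is a hypothetical stand-in, not the real function from the program above; only the signature matters for the option.

```cpp
// Hypothetical stand-in for the trigger function. The string that has to be
// passed to --toggle-collect is built from this signature:
//   name            -> ctrl_simulate
//   parameter types -> (float, int), with a space after the comma
float ctrl_simulate(float gain, int steps) {
    float state = 0.0f;
    for (int i = 0; i < steps; ++i) {
        state += gain;  // placeholder work to be measured
    }
    return state;
}
```

If the function lived in a namespace or a class, the qualified name would presumably be needed as well, so copying the name straight from an existing Callgrind profile remains the safest route.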
Hiers answered 1/8, 2017 at 18:29 Comment(0)