callgrind slow with instrumentation turned off

I am using callgrind to profile a multi-threaded Linux app, and mostly it works great. I start it with instrumentation off (--instr-atstart=no) and then, once setup is done, I turn it on with callgrind_control -i on. However, when I change certain configuration options to profile a different part of the app, it runs extremely slowly even before I turn instrumentation on. Code that takes a few seconds in normal operation takes over an hour under callgrind with instrumentation turned off. Any ideas as to why that might be, and how to go about debugging or resolving the slowness?
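For reference, the workflow described above can be scripted roughly as follows. This is only a sketch: the helper names and the `wait_for_setup` hook are made up for illustration, and it assumes `valgrind` and `callgrind_control` are on the PATH and that the pid returned by `Popen` is the pid callgrind_control sees.

```python
import subprocess


def callgrind_launch_argv(target_argv):
    """Command line that starts the target under callgrind with instrumentation off."""
    return ["valgrind", "--tool=callgrind", "--instr-atstart=no", *target_argv]


def instrumentation_argv(enabled, pid=None):
    """Command line for callgrind_control that switches instrumentation on or off.

    If pid is None, no target is named and callgrind_control picks the
    running callgrind process(es) itself.
    """
    argv = ["callgrind_control", "-i", "on" if enabled else "off"]
    if pid is not None:
        argv.append(str(pid))
    return argv


def profile_after_setup(target_argv, wait_for_setup):
    """Start uninstrumented, wait for setup to finish, then enable instrumentation.

    wait_for_setup is a hypothetical callable supplied by the caller that
    blocks until the application has finished its setup phase.
    """
    proc = subprocess.Popen(callgrind_launch_argv(target_argv))
    wait_for_setup()
    subprocess.run(instrumentation_argv(True, proc.pid), check=True)
    return proc.wait()
```

Something like `profile_after_setup(["./app"], lambda: input("press Enter once setup is done"))` would mirror doing the two steps by hand.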

Elvyn answered 27/1, 2012 at 19:30 Comment(4)
What are the "certain configurations to try to profile a different part of the app"?Brien
user779, can you check the speed of the application with valgrind's "nul" tool (nulgrind) and with valgrind's Lackey tool?Thickskinned
@jpalecek: all I mean is that users can enable or disable features through config. Enabling some of them (which recursively drill in for more detail on a set of objects, resulting in a lot more computation) makes it start crawling.Elvyn
@osgx: I just tried and I see the same slowness with nulgrind tooElvyn

Callgrind is a tool built on Valgrind. Valgrind is basically a dynamic binary translator (libVEX, which is part of Valgrind). It decodes every instruction and JIT-compiles it into a stream of instructions for the same CPU.

As far as I know, Valgrind has no way to enable this translation for an already running process, so dynamic translation is active the whole time, from the start of the program. It can't be turned off either.

Tools are built on Valgrind by adding instrumentation code. The "Nul" tool (nulgrind, --tool=none) is the tool that adds no instrumentation. But every tool runs on Valgrind, so dynamic translation is active all the time. Turning instrumentation on and off in callgrind only turns the additional instrumentation on and off.
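One way to see how much of the slowdown comes from the translation itself is to time the same command natively, under nulgrind, and under callgrind with instrumentation off. A rough sketch, assuming `valgrind` is on the PATH; the function names are my own, and the callgrind run will still write its usual output file:

```python
import subprocess
import sys
import time


def valgrind_argv(tool, target_argv, extra_opts=()):
    """Build a valgrind command line for the given tool and target program."""
    return ["valgrind", f"--tool={tool}", *extra_opts, *target_argv]


def time_command(argv):
    """Run argv to completion and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(argv, check=True,
                   stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return time.perf_counter() - start


def compare(target_argv):
    """Time the target natively, under nulgrind, and under uninstrumented callgrind."""
    runs = {
        "native": list(target_argv),
        "nulgrind": valgrind_argv("none", target_argv),
        "callgrind (instr off)": valgrind_argv(
            "callgrind", target_argv, ["--instr-atstart=no"]),
    }
    return {name: time_command(argv) for name, argv in runs.items()}


def report(results, out=sys.stdout):
    """Print each timing together with its ratio to the native run."""
    baseline = results["native"]
    for name, seconds in results.items():
        print(f"{name:>22}: {seconds:8.2f}s ({seconds / baseline:6.1f}x native)",
              file=out)
```

Calling `report(compare(["./app", "--some-config"]))` (the program name and flag are placeholders) prints the three timings. If the nulgrind and callgrind-off numbers are close, the cost is in Valgrind's core rather than in callgrind.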

The virtual CPU implemented by Valgrind is limited; there is an (incomplete) list of limitations at http://valgrind.org/docs/manual/manual-core.html#manual-core.limits Most of the limitations concern floating-point operations, which may be emulated incorrectly.

Is the change connected with floating-point operations? Or with other listed limitations?

You should also know that "Valgrind serialises execution so that only one thread is running at a time" (from the same page, manual-core.html).

Thickskinned answered 9/2, 2012 at 19:38 Comment(1)
PS: the overhead of nulgrind (basic libVEX instrumentation) is huge. Nulgrind is estimated to be 2-10 times slower than native code (for example in os.inf.tu-dresden.de/papers_ps/vee08-pohle.pdf); any other tool on Valgrind is slower than nulgrind. Callgrind with instrumentation turned off will run at the speed of nulgrind; callgrind with instrumentation turned on will run several times slower.Thickskinned
