I used to use a nice Apple profiler that is built into the Activity Monitor application. As long as your C++ code was compiled with debug information, you could sample your running application and it would print an indented tree showing, for each function, what percentage of its parent's time was spent in it (and how much of that was in the function body versus calls to other functions).
For instance, if main calls function_1 and function_2, function_2 calls function_3, and main then calls function_3:
main (100%, 1% in function body):
    function_1 (9%, 9% in function body):
    function_2 (90%, 85% in function body):
        function_3 (100%, 100% in function body)
    function_3 (1%, 1% in function body)
I would see this and think, "Something is taking a long time in the code in the body of function_2. If I want my program to be faster, that's where I should start."
How can I most easily get this exact profiling output for a Python program?
I've seen people say to do this:
import cProfile, pstats
prof = cProfile.Profile()
prof.runctx("real_main(argv)", globals(), locals())  # profile one call to real_main
stats = pstats.Stats(prof)
stats.sort_stats("time")  # or "cumulative"
stats.print_stats(80)  # 80 = maximum number of entries to print
But it's quite messy compared to that elegant call tree. Please let me know if there is an easy way to do this; it would help quite a bit.
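To make the target concrete, here is a rough sketch of the kind of output I mean, built on the same cProfile data. It leans on assumptions: it reads the `stats` dictionary on `pstats.Stats`, which is not a documented interface and may change, and it expects the caller records to be the 4-tuples that cProfile produces. The helper names `percent`, `format_call_tree`, and `profile_call_tree` are made up for illustration, and `real_main`, `function_1`, `function_2`, and `function_3` are stand-ins for a real program. Each percentage is relative to the parent's total cumulative time. Since cProfile records caller/callee pairs rather than full call stacks, anything below the first level is only an approximation.

```python
import cProfile
import pstats


def function_1():
    total = 0
    for i in range(10_000):
        total += i
    return total


def function_3():
    total = 0
    for i in range(10_000):
        total += i * i
    return total


def function_2():
    total = 0
    for i in range(100_000):
        total += i
    return total + function_3()


def real_main():
    function_1()
    function_2()
    function_3()


def percent(part, whole):
    """Share of `whole` taken by `part`, guarding against a zero denominator."""
    return 100.0 * part / whole if whole > 0 else 0.0


def format_call_tree(stats, root_name):
    """Return indented tree lines for the call graph below root_name."""
    # Assumption: undocumented layout, func -> (cc, nc, tottime, cumtime, callers)
    table = stats.stats

    # Invert the callee -> callers records into caller -> {callee: (own, cum)}.
    callees = {}
    for func, (_cc, _nc, _tt, _ct, callers) in table.items():
        for caller, (_n, _c, own, cum) in callers.items():
            callees.setdefault(caller, {})[func] = (own, cum)

    # Look the root up by bare name; assumes the name is unique in the profile.
    root = next(f for f in table if f[2] == root_name)
    lines = []

    def walk(func, own, cum, parent_cum, depth, path):
        lines.append("%s%s (%.0f%%, %.0f%% in function body)" % (
            "    " * depth, func[2],
            percent(cum, parent_cum), percent(own, parent_cum)))
        if func in path:  # recursive call: do not descend again
            return
        # Most expensive children first.
        children = sorted(callees.get(func, {}).items(),
                          key=lambda kv: -kv[1][1])
        for child, (c_own, c_cum) in children:
            walk(child, c_own, c_cum, table[func][3], depth + 1, path | {func})

    walk(root, table[root][2], table[root][3], table[root][3], 0, frozenset())
    return lines


def profile_call_tree(func, *args, **kwargs):
    """Profile one call to func and return its call tree as text lines."""
    prof = cProfile.Profile()
    prof.runcall(func, *args, **kwargs)
    return format_call_tree(pstats.Stats(prof), func.__name__)


if __name__ == "__main__":
    print("\n".join(profile_call_tree(real_main)))
```

The exact percentages vary from run to run, but the shape should match the tree above: real_main at the top, function_1, function_2, and function_3 indented beneath it, and a second function_3 entry nested under function_2.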