Understanding ruby-prof output
Asked Answered
D

1

61

I ran ruby-profiler on one of my programs. I'm trying to figure out what each fields mean. I'm guessing everything is CPU time (and not wall clock time), which is fantastic. I want to understand what the "---" stands for. Is there some sort of stack information in there. What does calls a/b mean?

Thread ID: 81980260
Total Time: 0.28

  %total   %self     total      self      wait     child            calls   Name
--------------------------------------------------------------------------------
                      0.28      0.00      0.00      0.28              5/6     FrameParser#receive_data
 100.00%   0.00%      0.28      0.00      0.00      0.28                6     FrameParser#read_frames
                      0.28      0.00      0.00      0.28              4/4     ChatServerClient#receive_frame
                      0.00      0.00      0.00      0.00             5/47     Fixnum#+
                      0.00      0.00      0.00      0.00              1/2     DebugServer#receive_frame
                      0.00      0.00      0.00      0.00            10/29     String#[]
                      0.00      0.00      0.00      0.00            10/21     <Class::Range>#allocate
                      0.00      0.00      0.00      0.00            10/71     String#index
--------------------------------------------------------------------------------
 100.00%   0.00%      0.28      0.00      0.00      0.28                5     FrameParser#receive_data
                      0.28      0.00      0.00      0.28              5/6     FrameParser#read_frames
                      0.00      0.00      0.00      0.00             5/16     ActiveSupport::CoreExtensions::String::OutputSafety#add_with_safety
--------------------------------------------------------------------------------
                      0.28      0.00      0.00      0.28              4/4     FrameParser#read_frames
 100.00%   0.00%      0.28      0.00      0.00      0.28                4     ChatServerClient#receive_frame
                      0.28      0.00      0.00      0.28              4/6     <Class::Lal>#safe_call
--------------------------------------------------------------------------------
                      0.00      0.00      0.00      0.00              1/6     <Class::Lal>#safe_call
                      0.00      0.00      0.00      0.00              1/6     DebugServer#receive_frame
                      0.28      0.00      0.00      0.28              4/6     ChatServerClient#receive_frame
 100.00%   0.00%      0.28      0.00      0.00      0.28                6     <Class::Lal>#safe_call
                      0.21      0.00      0.00      0.21              2/4     ChatUserFunction#register
                      0.06      0.00      0.00      0.06              2/2     ChatUserFunction#packet
                      0.01      0.00      0.00      0.01            4/130     Class#new
                      0.00      0.00      0.00      0.00              1/1     DebugServer#profile_stop
                      0.00      0.00      0.00      0.00             1/33     String#==
                      0.00      0.00      0.00      0.00              1/6     <Class::Lal>#safe_call
                      0.00      0.00      0.00      0.00              5/5     JSON#parse
                      0.00      0.00      0.00      0.00              5/8     <Class::Log>#log
                      0.00      0.00      0.00      0.00              5/5     String#strip!
--------------------------------------------------------------------------------
Diplomat answered 20/3, 2011 at 3:17 Comment(3)
I would just like to point out that if you run under the debugger, interrupt it manually, and display the call stack, 4 times, on average 3 of those samples will be this: read_frames -> receive_frame -> safe_call -> register -> ... so your bottleneck will be be shouting to you for attention. That's this method. The remaining 25% of the time is mostly ... -> packet -> .... Everything else is basically irrelevant.Tananarive
@Mike, It would be helpful if you communicated how you interpret the graph to draw that conclusion, as I think that's what this question is trying to get at (how to interpret this output). It would be useful for me!Tildi
@Charles: I'm looking at the "total" column, which is amount of time spent on-stack. The first bunch of lines is about read_frames. All of its time is spent calling receive_frame. The third bunch of lines is about receive_frame, which spends all its time in safe_call. The fourth bunch says safe_call spends most of its time in register, then packet, and a little bit in new. (The second bunch says receive_data is always at the top.)Tananarive
L
69

Each section of the ruby-prof output is broken up into the examination of a particular function. for instance, look at the first section of your output. The read_frames method on FrameParser is the focus and it is basically saying the following:

  • 100% of the execution time that was profiled was spent inside of FrameParser#read_frames
  • FrameParser#read_frames was called 6 times.
  • 5 out of the 6 calls to read_frames came from FrameParser#receive_data and this accounted 100% of the execution time (this is the line above the read_frames line).
  • The lines below the read_frames (but within that first section) method are all of the methods that FrameParser#read_frames calls (you should be aware of that since this seems like it's your code), how many of that methods total calls read_frames is responsible for (the a/b calls column), and how much time those calls took. They are ordered by which of them took up the most execution time. In your case, that is receive_frame method on the ChatServer class.
  • You can then look down at the section focusing on receive_frames (2 down and centered with the '100%' line on receive_frame) and see how it's performance is broken down. each section is set up the same way and usually the subsequent function call which took the most time is the focus of the next section down. ruby-prof will continue doing this through the full call stack. You can go as deep as you want until you find the bottleneck you'd like to resolve.
Lexicology answered 20/3, 2011 at 3:25 Comment(2)
The repo moved, and that link is now available at github.com/ruby-prof/ruby-prof/blob/master/examples/graph.txtHorary
@MarkRushakoff new graph location ruby-prof.github.io/examples/primes.graph.htmlDatcha

© 2022 - 2024 — McMap. All rights reserved.