How to get a function-duration breakdown in go (profiling)
Asked Answered
S

1

5

Update (Jan 24, 2019):

This question was asked 4 years ago about Go 1.4 (and is still getting views). Profiling with pprof has changed dramatically since then.

Original Question:

I'm trying to profile a go martini based server I wrote, I want to profile a single request, and get the complete breakdown of the function with their runtime duration. I tried playing around with both runtime/pprof and net/http/pprof but the output looks like this:

Total: 3 samples
       1  33.3%  33.3%        1  33.3% ExternalCode
       1  33.3%  66.7%        1  33.3% runtime.futex
       1  33.3% 100.0%        2  66.7% syscall.Syscall

The web view is not very helpful either.

We regularly profile another program, and the output seems to be what I need:

20ms of 20ms total (  100%)
      flat  flat%  sum%        cum  cum%
      10ms 50.00% 50.00%     10ms 50.00%  runtime.duffcopy
      10ms 50.00%   100%     10ms 50.00%  runtime.fastrand1
         0     0%   100%     20ms   100%  main.func·004
         0     0%   100%     20ms   100%  main.pruneAlerts
         0     0%   100%     20ms   100%  runtime.memclr

I can't tell where the difference is coming from.

Sonny answered 8/12, 2014 at 17:19 Comment(7)
what OS are you using to profile?Neva
Vagrant. I saw some debating whether it matters, but no conclusion anywhere. We have a staging environment I can use, but do I go about the same approach?Sonny
Vagrant isn't an OS. You really need to profile on Linux, but there is a kernel patch here to make it work on darwin. I've also still seen results like this using LInux in VirtualBox on OSX (not sure if the patch helps that situation or not)Neva
You're right of course, what I meant was I could set up whatever I wanted if you had any suggestions - Ubuntu trusty. I have an older raring box ready too I can play with.Sonny
Oh, I missed part of your question -- pprof is a sampling profiler; it doesn't trace execution. You need to profile for a reasonable amount of time under load to get a representative set of samples.Neva
Can you please elaborate? A colleague is running pprof. StartCPUProfile on a different program and is getting the breakdown in the profile.Sonny
I updated the question and added the output from the other profile.Sonny
N
7

pprof is a timer based sampling profiler, originally from the gperftools suite. Rus Cox later ported the pprof tools to Go: http://research.swtch.com/pprof.

This timer based profiler works by using the system profiling timer, and recording statistics whenever it receives SIGPROF. In go, this is currently set to a constant 100Hz. From pprof.go:

// The runtime routines allow a variable profiling rate,
// but in practice operating systems cannot trigger signals
// at more than about 500 Hz, and our processing of the
// signal is not cheap (mostly getting the stack trace).
// 100 Hz is a reasonable choice: it is frequent enough to
// produce useful data, rare enough not to bog down the
// system, and a nice round number to make it easy to
// convert sample counts to seconds.  Instead of requiring
// each client to specify the frequency, we hard code it.
const hz = 100

You can set this frequency by calling runtime.SetCPUProfileRate and writing the profile output yourself, and Gperftools allows you to set this frequency with CPUPROFILE_FREQUENCY, but in practice it's not that useful.

In order to sample a program, it needs to be doing what you're trying to measure at all times. Sampling the idle runtime isn't showing anything useful. What you usually do is run the code you want in a benchmark, or in a hot loop, using as much CPU time as possible. After accumulating enough samples, there should be a sufficient number across all functions to show you proportionally how much time is spent in each function.

See also:

Neva answered 9/12, 2014 at 15:12 Comment(3)
I think I understand what you are saying, and it sounds reasonable that pprof is simply not the tool for me. I'm now just curious as to why I am getting the profile in a different format.Sonny
You have to show us what exactly was run to get the output you have. You also may be using different versions of the pprof tool.Neva
Oh, and for completeness, there is work being done on an execution tracer, which may land in go1.5. This could eventually give you want, and more (all in a nice visual format).Neva

© 2022 - 2024 — McMap. All rights reserved.