Sort cProfile output by percall when profiling a Python script
Asked Answered
S

2

113

I'm using python -m cProfile -s calls myscript.py

python -m cProfile -s percall myscript.py does not work.

The Python documentation says "Look in the Stats documentation for valid sort values.": http://docs.python.org/library/profile.html#module-cProfile, which I cannot find.

Syllabogram answered 26/4, 2012 at 3:45 Comment(2)
Also have a look at snakeviz: #1593519Perionychium
There's a patch for this which is being ignored unfortunately: bugs.python.org/issue18795Anaheim
O
151

-s only uses the keys found under sort_stats.

calls (call count)
cumulative (cumulative time)
cumtime (cumulative time)
file (file name)
filename (file name)
module (file name)
ncalls (call count)
pcalls (primitive call count)
line (line number)
name (function name)
nfl (name/file/line)
stdname (standard name)
time (internal time)
tottime (internal time)

Here's an example

python -m cProfile -s tottime myscript.py
Octant answered 18/12, 2012 at 0:40 Comment(8)
Just to clear it up, for noobs like me, you don't use the quotes :) So an example to sort methods by time spent in the method is "python -m cProfile -s tottime myscript.py"Predigest
There's nothing wrong with using quotes (at least in bash), is there? The quotes are eaten by bash command preprocessing..Landlubber
As far as windows is concerned, one cannot use the quotes. Also to be noted - sort by time and tottime is one and the same thing.Alter
I want to be able to sort the cprofile output by any of the columns provided. As far as I know, currently it is still not possible to sort it by the column labeled percall.Consonance
What is the percall column? You are able to sort on any column. docs.python.org/3/library/profile.html#pstats.Stats.sort_stats As a last resort you could use the index of the columnFor backward-compatibility reasons, the numeric arguments -1, 0, 1, and 2 are permitted.Octant
@JoshuaOlson your link to the docs.python makes no mention of sorting by percall. It seems odd to me that even though the percall is the most useful sorting column, yet it is not implemented.Durst
I suspect it isn't included because it is a hybrid/derived metric. This seems like a straightforward thing to be able to add. And python is always happy to accept new features if you want to submit a pull request. github.com/python/cpython/blob/main/Lib/pstats.pyOctant
Yes it's computed in Stats.print_line of pstats.py, which prints the information after it has been sorted. But you could also just write your own primitive printer: an instance of Stats has a dict() in the .stats field where the keys are function names. After looking at _lsprof.c it seems the values in the dict() are tuples of: (<num calls>, <num calls including recursive calls>, <time spent in call including subcalls>, <time spent in call excluding subcalls>, <what called this>).Hurst
R
0

As letmaik pointed out, there is an open bug ticket for this capability. However, the submitter (alexnvdias) already did the work for this and posted their code in the original bug ticket and in the github pull request. If you want to be able to use it, you can just subclass pstats.Stats and implement what they already submitted. It works for me just fine.

import pstats

class Stats(pstats.Stats):
    # list the tuple indices and directions for sorting,
    # along with some printable description
    sort_arg_dict_default = {
      "calls"            : (((1,-1),              ), "call count"),
      "ncalls"           : (((1,-1),              ), "call count"),
      "cumtime"          : (((4,-1),              ), "cumulative time"),
      "cumulative"       : (((4,-1),              ), "cumulative time"),
      "file"             : (((6, 1),              ), "file name"),
      "filename"         : (((6, 1),              ), "file name"),
      "line"             : (((7, 1),              ), "line number"),
      "module"           : (((6, 1),              ), "file name"),
      "name"             : (((8, 1),              ), "function name"),
      "nfl"              : (((8, 1),(6, 1),(7, 1),), "name/file/line"),
      "pcalls"           : (((0,-1),              ), "primitive call count"),
      "stdname"          : (((9, 1),              ), "standard name"),
      "time"             : (((2,-1),              ), "internal time"),
      "tottime"          : (((2,-1),              ), "internal time"),
      "cumulativepercall": (((5,-1),              ), "cumulative time per call"),
      "totalpercall"     : (((3,-1),              ), "total time per call"),
      }


    def sort_stats(self, *field):
      if not field:
        self.fcn_list = 0
        return self
      if len(field) == 1 and isinstance(field[0], int):
        # Be compatible with old profiler
        field = [ {-1: "stdname",
                    0:  "calls",
                    1:  "time",
                    2:  "cumulative"}[field[0]] ]
      elif len(field) >= 2:
        for arg in field[1:]:
          if type(arg) != type(field[0]):
            raise TypeError("Can't have mixed argument type")

      sort_arg_defs = self.get_sort_arg_defs()

      sort_tuple = ()
      self.sort_type = ""
      connector = ""
      for word in field:
        if isinstance(word, pstats.SortKey):
          word = word.value
        sort_tuple = sort_tuple + sort_arg_defs[word][0]
        self.sort_type += connector + sort_arg_defs[word][1]
        connector = ", "
  
      stats_list = []
      for func, (cc, nc, tt, ct, callers) in self.stats.items():
        if nc == 0:
          npc = 0
        else:
          npc = float(tt)/nc

        if cc == 0:
          cpc = 0
        else:
          cpc = float(ct)/cc

        stats_list.append((cc, nc, tt, npc, ct, cpc) + func +
                          (pstats.func_std_string(func), func))

      stats_list.sort(key=pstats.cmp_to_key(pstats.TupleComp(sort_tuple).compare))

      self.fcn_list = fcn_list = []
      for tuple in stats_list:
        fcn_list.append(tuple[-1])
      return self

s = Stats('test.dat')
s.sort_stats('totalpercall')
s.print_stats()

Result:

Mon Apr 29 12:50:37 2024    test.dat

     2003 function calls in 0.058 seconds

Ordered by: total time per call

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    1    0.001    0.001    0.058    0.058 test.py:2(<module>)
 1000    0.029    0.000    0.029    0.000 test.py:2(add)
 1000    0.028    0.000    0.028    0.000 test.py:8(sub)
    1    0.000    0.000    0.058    0.058 {built-in method builtins.exec}
    1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
Rochellerochemont answered 29/4 at 18:48 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.