Which Python memory profiler is recommended? [closed]
I want to know the memory usage of my Python application, and specifically I want to know which code blocks/portions or objects are consuming the most memory. A Google search shows that one commercial option is Python Memory Validator (Windows only).

Open-source options include PySizer and Heapy.

I haven't tried any of them, so I want to know which one is best, considering:

  1. It gives the most details.

  2. It requires the fewest (or no) changes to my code.

Bridgehead answered 21/9, 2008 at 4:43 Comment(6)
For finding the sources of leaks I recommend objgraph.Chaqueta
@MikeiLL There is a place for questions like these: Software RecommendationsCarrier
This is happening often enough that we should be able to migrate one question to another forum instead.Masterson
One tip: if you use GAE and want to check memory usage, it's a big headache, because those tools either output nothing or fail to start. If you want to test something small, move the function you want to test into a separate file and run that file alone.Hutcheson
I recommend pymplerKite
Check out memrayGlennglenna
313

guppy3 is quite simple to use. At some point in your code, you have to write the following:

from guppy import hpy
h = hpy()
print(h.heap())

This gives you some output like this:

Partition of a set of 132527 objects. Total size = 8301532 bytes.
Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
0  35144  27  2140412  26   2140412  26 str
1  38397  29  1309020  16   3449432  42 tuple
2    530   0   739856   9   4189288  50 dict (no owner)

You can also find out from where objects are referenced and get statistics about that, but the docs on that are a bit sparse.
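As a sketch of that referrer view, the same heap object exposes a `byrcs` attribute that regroups the partition by referencing pattern (this assumes guppy3, installed via pip install guppy3; the try/except guard is only so the snippet runs even where it is absent):

```python
# Sketch of the referrer statistics mentioned above.
try:
    from guppy import hpy
except ImportError:
    hpy = None  # guppy3 not installed; the helper below becomes a no-op

def heap_by_referrers():
    """Return heap stats grouped by referrer pattern, or None without guppy."""
    if hpy is None:
        return None
    heap = hpy().heap()
    return heap.byrcs  # same totals as heap, partitioned by who references each object

stats = heap_by_referrers()
if stats is not None:
    print(stats)
```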

There is a graphical browser as well, written in Tk.

For Python 2.x, use Heapy.

Rhodic answered 21/9, 2008 at 11:45 Comment(14)
sadly doesn't seem to build or install in osx.. 10.4 at least.Nihilism
It builds on OS X 10.7.1 with homebrew, but sadly doesn't run :-(Merrimerriam
If you're on Python 2.7 you may need the trunk version of it: sourceforge.net/tracker/…, pip install https://guppy-pe.svn.sourceforge.net/svnroot/guppy-pe/trunk/guppyDandelion
Latest version (0.1.9) builds on Windows for Python 2.6 x64 but h.heap() call causes APPCRASH.Renatarenate
The heapy docs are... not good. But I found this blog post very helpful for getting started: smira.ru/wp-content/uploads/2011/08/heapy.htmlUnhappy
Heapy is by far the easiest heap profiler to run when attaching to a leaking Python process with rfoo; it works fine in a multithreaded app and installs nicely with "pip install guppy". Usually the default view works, but hpy offers several views of the profile data, including showing you use count by reference. The blog post linked by @JoeShaw is very helpful.Zaid
Note, heapy doesn't include memory allocated in python extensions. If anybody has worked out a mechanism to get heapy to include boost::python objects, it would be nice to see some examples!Colombi
As of 2014-07-06, guppy does not support Python 3.Relucent
@JamesSnyder Looks like the normal pip version (1.10) is now ok with python 2.7Waltner
Just installed fine with pip (python 2.7). I found that the problem I wanted to use it for (memory use continually increasing) disappears when I call h.heap(). Any ideas why this might be?Shadow
How is knowing that "str" is consuming the most memory in any way useful? That could be one of a million points in the code. Without knowing where those calls are made, the info provided here is useless.Ideogram
There is a fork of guppy that supports Python 3 called guppy3.Ulrich
Where should we insert this profiler code in our existing Python code? Should it be at the end/beginning? How do we integrate this profiler code into our existing code for resource-usage stats?Naze
My favorite docs for Heapy/guppy3 is the research paper that created it, especially §6.2 "Debugging approach": liu.diva-portal.org/smash/get/diva2:22287/FULLTEXT01Ulrich
501

My module memory_profiler is capable of printing a line-by-line report of memory usage and works on Unix and Windows (it needs psutil on the latter). The output is not very detailed, but the goal is to give you an overview of where your code consumes the most memory, not an exhaustive analysis of allocated objects.

After decorating your function with @profile and running your script with python -m memory_profiler, it will print a line-by-line report like this:

Line #    Mem usage  Increment   Line Contents
==============================================
     3                           @profile
     4      5.97 MB    0.00 MB   def my_func():
     5     13.61 MB    7.64 MB       a = [1] * (10 ** 6)
     6    166.20 MB  152.59 MB       b = [2] * (2 * 10 ** 7)
     7     13.61 MB -152.59 MB       del b
     8     13.61 MB    0.00 MB       return a
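For reference, the function that produces the report above is just the following (the try/except fallback is only so the snippet also runs where memory_profiler is not installed):

```python
# Decorate the function to profile; run with: python -m memory_profiler script.py
try:
    from memory_profiler import profile
except ImportError:
    def profile(func):  # no-op fallback when memory_profiler is absent
        return func

@profile
def my_func():
    a = [1] * (10 ** 6)        # ~7.6 MB list
    b = [2] * (2 * 10 ** 7)    # ~152.6 MB list
    del b                      # released immediately
    return a

if __name__ == "__main__":
    my_func()
```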
Postpone answered 14/5, 2012 at 22:51 Comment(18)
For my usecase - a simple image manipulation script, not a complex system, which happened to leave some cursors open - this was the best solution. Very simple to drop in and figure out what's going on, with minimal gunk added to your code. Perfect for quick fixes and probably great for other applications too.Vernalize
This is great. Is there any way to use it to collect memory usage per object? (as opposed to per line). Ideally from an IPython session with objects already in memory. If not, do you have any pointers on something along these lines?Precondition
It doesn't get memory usage of individual objects. For that task, guppy/heapy might be what you want.Postpone
I find memory_profiler to be really simple and easy to use. I want to do profiling per line and not per object. Thanks for writing.Restitution
@FabianPedregosa How does memory_profiler handle loops? Can it identify the loop iteration number?Walcott
It identifies loops only implicitly when it tries to report the line-by-line amount and it finds duplicated lines. In that case it will just take the max of all iterations.Postpone
I tried to profile the memory usage of a Python application that used TensorFlow in CPU mode, as a function of input image size; python -m memory_profiler example.py did not give me correct results, while mprof gave me results similar to htop.Slavey
Does not seem to perform very well in CPU-intensive programsBurnham
@FabianPedregosa: How do you specify the installation path? I want to install it in another Python folder. Thanks.Basketwork
Same way as any other python package, pip install --target=/custom/path memory_profilerPostpone
I have tried memory_profiler but think it is not a good choice. It makes program execution incredibly slow (in my case, approximately 30 times slower).Godly
There is a constant per-line overhead in tracking memory consumption, so if your program is extremely long or has many fast for/while loops, I would expect this to slow your program down significantly. In that case, the time-based (as opposed to line-based) profiler might be better. It is run as mprof run &lt;script&gt;; see the docs for more information.Postpone
memory_profiler and heapy solve two different problems, I guess: one is concerned with memory consumption per line, while the other works at the level of objects.Tohubohu
@FabianPedregosa Does memory_profiler buffer its output? I may be doing something wrong, but it seems that rather than dump the profile for a function when it completes, it waits for the script to end.Colorfast
It does indeed wait until the script finishes. It would not be easy to do otherwise as the function could be called again, in which case memory_profiler will aggregate the results.Postpone
@FabianPedregosa Thanks for such a useful and simple library! I am confused by the output, though: when I run mprof run test.py and then mprof plot, I get different memory usage from the line-by-line output vs. over time. Line-by-line I get a maximum of 550MiB, while from the plot I get a maximum of 5000MiB. What could the problem be? Thanks!Frannie
For me memory profiler slowed down execution by roughly a factor 10! Note that I had large objects in the orders of a few GB. Otherwise cool tool.Gifford
This tool is no longer maintained.Raffo
84

I recommend Dowser. It is very easy to set up, and you need zero changes to your code. You can view counts of objects of each type over time, view the list of live objects, and view references to live objects, all from a simple web interface.

# memdebug.py

import cherrypy
import dowser

def start(port):
    cherrypy.tree.mount(dowser.Root())
    cherrypy.config.update({
        'environment': 'embedded',
        'server.socket_port': port
    })
    cherrypy.server.quickstart()
    cherrypy.engine.start(blocking=False)

You import memdebug and call memdebug.start(port). That's all.

I haven't tried PySizer or Heapy. I would appreciate others' reviews.

UPDATE

The above code is for CherryPy 2.x. In CherryPy 3.x, the server.quickstart method has been removed and engine.start does not take the blocking flag. So if you are using CherryPy 3.x:

# memdebug.py

import cherrypy
import dowser

def start(port):
    cherrypy.tree.mount(dowser.Root())
    cherrypy.config.update({
        'environment': 'embedded',
        'server.socket_port': port
    })
    cherrypy.engine.start()
Direful answered 21/9, 2008 at 4:50 Comment(10)
But is it only for CherryPy? How do you use it with a simple script?Bridgehead
It is not for CherryPy. Think of CherryPy as a GUI toolkit.Direful
fwiw, the pysizer page pysizer.8325.org seems to recommend heapy, which it says is similarWoolsey
It looks as though your above code is for use with CherryPy 2.x. For CherryPy 3.x, remove the blocking=False from the cherrypy.engine.start() call.Neelon
There is a generic WSGI port of Dowser called Dozer, which you can use with other web servers as well: pypi.python.org/pypi/DozerUnhappy
cherrypy 3.1 removed cherrypy.server.quickstart(), so just use cherrypy.engine.start()Faubourg
I like and use dowser, but the problem for me is that the application I'm using it in gives you like 1000 graphs and it becomes a pain to find what is important, and after you do, the pain point may have so many graphs that the trace page doesn't even load properly. So it doesn't scale very well.Farfamed
It looks like aminus.net no longer exists. Some quick web searching found references to it that only indicated it existing on aminus.net websites. Telling Anaconda Prompt conda search dowser found nothing. I would conclude that Dowser is no longer easily available, and is surely not being maintained.Gimlet
this doesn't work in python 3. I get an obvious StringIO error.Papiamento
Be careful: the link in the answer to Dowser is taking me to very phony websites impersonating more or less respectable news sources...Kean
70

Consider the objgraph library (see this blog post for an example use case).
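A minimal sketch of the usual objgraph workflow looks like this (it assumes objgraph is installed via pip install objgraph; the guard just keeps the snippet runnable without it):

```python
try:
    import objgraph
except ImportError:
    objgraph = None  # objgraph not installed; helper below becomes a no-op

def show_growth():
    """Print object types whose instance counts grew since the last call."""
    if objgraph is not None:
        objgraph.show_growth(limit=5)

show_growth()                              # establish a baseline
suspect = [dict(x=i) for i in range(1000)]  # code suspected of leaking
show_growth()                              # growth reported here points at candidate leaks
```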

Schoenfelder answered 27/10, 2009 at 19:41 Comment(2)
objgraph helped me solve a memory leak issue I was facing today. objgraph.show_growth() was particularly usefulPhotofluorography
I, too, found objgraph really useful. You can do things like objgraph.by_type('dict') to understand where all of those unexpected dict objects are coming from.Rilke
19

Muppy is (yet another) memory usage profiler for Python. The focus of this toolset is on the identification of memory leaks.

Muppy tries to help developers identify memory leaks in Python applications. It enables the tracking of memory usage at runtime and the identification of objects that are leaking. Additionally, it provides tools to locate the source of objects that are not released.
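Muppy now ships as part of Pympler; a minimal summary of all live objects looks roughly like this (a sketch, guarded so it runs even where pympler is not installed):

```python
try:
    from pympler import muppy, summary
except ImportError:
    muppy = summary = None  # pympler not installed; helper becomes a no-op

def print_heap_summary():
    """Print a table of live object types with counts and total sizes."""
    if muppy is None:
        return
    all_objects = muppy.get_objects()  # every object the GC can see
    summary.print_(summary.summarize(all_objects))

print_heap_summary()
```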

Greysun answered 11/3, 2013 at 14:17 Comment(0)
16

I'm developing a memory profiler for Python called memprof:

http://jmdana.github.io/memprof/

It allows you to log and plot the memory usage of your variables during the execution of the decorated methods. You just have to import the library using:

from memprof import memprof

And decorate your method using:

@memprof

This is an example of what the plots look like:

(example plot: memory usage of each tracked variable over the decorated function's execution)

The project is hosted in GitHub:

https://github.com/jmdana/memprof
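Putting the two steps above together, a usage sketch looks like this (the function body is hypothetical, and the try/except fallback only lets the snippet run where memprof is not installed):

```python
try:
    from memprof import memprof
except ImportError:
    def memprof(func):  # no-op fallback when memprof is absent
        return func

@memprof
def compute():
    a = list(range(10 ** 5))      # per-variable memory is tracked by memprof
    b = [x * 2 for x in a]
    return sum(b)

compute()
```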

Politic answered 3/7, 2013 at 12:12 Comment(2)
How do I use it? What is a,b,c?Restitution
@Restitution a, b and c are the names of the variables. You can find the documentation at github.com/jmdana/memprof. If you have any questions please feel free to submit an issue in github or send an email to the mailing list that can be found in the documentation.Politic
12

I found meliae to be much more functional than Heapy or PySizer. If you happen to be running a WSGI web app, then Dozer is a nice middleware wrapper of Dowser.

Millham answered 25/10, 2011 at 21:31 Comment(0)
7

Also try the pytracemalloc project, which provides memory usage per Python line number.

EDIT (2014/04): It now has a Qt GUI to analyze snapshots.
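Since Python 3.4, tracemalloc has been part of the standard library, so a minimal per-line snapshot needs no third-party install:

```python
import tracemalloc

tracemalloc.start()

data = [bytes(1000) for _ in range(1000)]  # allocate roughly 1 MB

snapshot = tracemalloc.take_snapshot()
for stat in snapshot.statistics("lineno")[:3]:
    print(stat)  # top allocation sites, grouped by file and line number

current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()
```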

Welltimed answered 4/9, 2013 at 22:56 Comment(1)
tracemalloc is now part of the python standard library. See docs.python.org/3/library/tracemalloc.htmlBeelzebub
