While working through Jake VanderPlas's "Python Data Science Handbook", I was recreating the usage examples of the various debugging and profiling tools. He provides the following function as an example for demonstrating %mprun:
def sum_of_lists(N):
    total = 0
    for i in range(5):
        L = [j ^ (j >> i) for j in range(N)]
        total += sum(L)
        del L
    return total
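For reference, the book's setup for this is roughly the following: save the function to a file, import it, and run %mprun on a call. The file name mprun_demo.py and the argument 1000000 below are taken from the book (from memory) rather than from my output, so treat them as assumptions:

    %load_ext memory_profiler
    # the function above is saved to mprun_demo.py (e.g. via the %%file cell magic)
    from mprun_demo import sum_of_lists
    %mprun -f sum_of_lists sum_of_lists(1000000)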
I proceeded to execute it in a Jupyter notebook, and got the following output:
Line # Mem usage Increment Line Contents
================================================
1 81.3 MiB 81.3 MiB def sum_of_lists(N):
2 81.3 MiB 0.0 MiB total = 0
3 81.3 MiB 0.0 MiB for i in range(5):
4 113.2 MiB -51106533.7 MiB L = [j ^ (j >> i) for j in range(N)]
5 119.1 MiB 23.5 MiB total += sum(L)
6 81.3 MiB -158.8 MiB del L
7 81.3 MiB 0.0 MiB return total
... which immediately struck me as odd. According to the book, line 4 should show roughly a 25.4 MiB increase, with a matching negative increment on line 6 where the list is deleted. Instead I get a massive negative increment on line 4, which doesn't line up with anything I would have expected; going by the -158.8 MiB on line 6, line 4's increment should have been about +158.8 MiB. The Mem usage column, on the other hand, paints a more sensible picture (113.2 - 81.3 = 31.9 MiB increase). So I'm left with a weird, giant negative increment, and two measured changes in memory usage that don't agree with each other. What is going on, then?
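One way I could sanity-check the Mem usage column, independently of %mprun, is to sample the process memory around a single iteration of the loop with memory_profiler's memory_usage helper (N=1000000 and i=0 below are just illustrative choices on my part):

    from memory_profiler import memory_usage

    def one_iteration(N=1000000, i=0):
        # the same per-iteration work as line 4 of sum_of_lists
        L = [j ^ (j >> i) for j in range(N)]
        return sum(L)

    baseline = memory_usage(-1, interval=0.1, timeout=0.5)[0]        # current process, in MiB
    samples = memory_usage((one_iteration, (1000000,)), interval=0.1)  # sampled while the call runs
    print(f"baseline: {baseline:.1f} MiB, peak during call: {max(samples):.1f} MiB")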
Just to check if there's something truly bizarre going on with my interpreter/profiler, I went ahead and replicated the example given in this answer, and got this output:
Line # Mem usage Increment Line Contents
================================================
2 86.5 MiB 86.5 MiB def my_func():
3 94.1 MiB 7.6 MiB a = [1] * (10 ** 6)
4 246.7 MiB 152.6 MiB b = [2] * (2 * 10 ** 7)
5 94.1 MiB -152.6 MiB del b
6 94.1 MiB 0.0 MiB return a
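For completeness, the profiled function, as read off the Line Contents column above, is essentially the standard memory_profiler example:

    def my_func():
        a = [1] * (10 ** 6)        # ~7.6 MiB
        b = [2] * (2 * 10 ** 7)    # ~152.6 MiB, matching the Increment column
        del b
        return a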
Nothing wrong there, I think. What could be going on with the previous example?