Matplotlib errors result in a memory leak. How can I free up that memory?

I am running a Django app that uses matplotlib and lets the user specify the axes of the graph. This can result in 'OverflowError: Agg complexity exceeded'.

When that happens, up to 100 MB of RAM gets tied up. Normally I free that memory using fig.clf(), plt.close(), and gc.collect(), but the memory associated with the error does not seem to be associated with the plot object.

Does anyone know how I can release that memory?

Thanks.

Here is some code that gives me the Agg Complexity Error.

import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
import numpy as np      
import gc

a = np.arange(1000000)
b = np.random.randn(1000000)

fig = plt.figure(num=1, dpi=100, facecolor='w', edgecolor='w')
fig.set_size_inches(10,7)
ax = fig.add_subplot(111)
ax.plot(a, b)

fig.savefig('yourdesktop/random.png')   # code gives me an error here

fig.clf()    # normally I use these lines to release the memory
plt.close()
del a, b
gc.collect()
Quirinus asked 19/8, 2011 at 18:20

I assume you can run the code you posted at least once. The problem only manifests itself after running the posted code many times. Correct?

If so, the following avoids the problem without really identifying the source of the problem. Maybe that is a bad thing, but this works in a pinch: Simply use multiprocessing to run the memory-intensive code in a separate process. You don't have to worry about fig.clf() or plt.close() or del a,b or gc.collect(). All memory is freed when the process ends.

import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
import numpy as np      

import multiprocessing as mp

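def worker():
    # Everything allocated here lives in the child process and is
    # returned to the OS when that process exits.
    N = 1000000
    a = np.arange(N)
    b = np.random.randn(N)

    fig = plt.figure(num=1, dpi=100, facecolor='w', edgecolor='w')
    fig.set_size_inches(10, 7)
    ax = fig.add_subplot(111)
    ax.plot(a, b)

    fig.savefig('/tmp/random.png')   # the OverflowError, if any, is raised here

if __name__ == '__main__':
    proc = mp.Process(target=worker)
    proc.daemon = True   # the worker is terminated if the main process exits
    proc.start()
    proc.join()          # wait for the worker to finish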

You don't have to call proc.join() either. The join blocks the main process until the worker completes. If you omit it, the main process simply continues while the worker runs in the background.
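If the caller needs to know whether the render succeeded, the child's exit code can be checked after the join. The following is a minimal sketch and not part of the original answer; the names render and render_isolated and the timeout parameter are hypothetical. It relies on multiprocessing reporting a nonzero exitcode when the target raises an uncaught exception. The if __name__ == '__main__': guard keeps the launching code from running again when a child re-imports the module, which happens on platforms that start children with "spawn" (Windows, for example).

```python
import multiprocessing as mp
import os
import tempfile


def render(path, n_points):
    # Hypothetical worker: imports live here so only the child loads matplotlib.
    import matplotlib
    matplotlib.use("Agg")
    import matplotlib.pyplot as plt
    import numpy as np

    x = np.arange(n_points)
    y = np.random.randn(n_points)
    fig = plt.figure(figsize=(10, 7), dpi=100)
    ax = fig.add_subplot(111)
    ax.plot(x, y)
    fig.savefig(path)  # any exception here ends the child with a nonzero exit code


def render_isolated(path, n_points, timeout=60):
    """Run render() in a child process; return True only if it exited cleanly."""
    proc = mp.Process(target=render, args=(path, n_points))
    proc.start()
    proc.join(timeout)
    if proc.is_alive():
        # Still running after the timeout: stop it and report failure.
        proc.terminate()
        proc.join()
        return False
    return proc.exitcode == 0


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "random.png")
    ok = render_isolated(target, 1000)
    print("rendered" if ok else "rendering failed")
```

In a web view, a False result could be turned into an error message for the user instead of a server error, and whatever memory the failed render held is released with the child process.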

Haubergeon answered 19/8, 2011 at 18:33 Comment(4)
The code I posted fails the first time through. It was created to recreate the special case of a user zooming in too far on the y-axis for high sample rate data. If a plot looks like a wash of blue without any white background showing, the code fails. However, your solution looks like a better way of managing memory. I am a novice and I don't fully understand what is going on with the if __name__ == '__main__': block. I will try to add this to my code. Can you point me to a resource that explains what is going on? Or can you offer a quick explanation? Thanks. - Quirinus
@sequoia: In that case, maybe you need to restrict the user so that in no case can the user request 1e6 points to be plotted. The if __name__... block is not necessary unless you are on Windows. I'd be glad to try to explain any specific questions you have, but I think in this case everything is explained much better than I can here. - Haubergeon
Thanks, that is a helpful link. I am going to use your implementation and take your suggestion to limit the extent of the user requests. - Quirinus
Wow! I can't believe how sweetly this workaround solved a similar problem for me. It's not a solution to the underlying problem, sure. But it is a good technique to have in the quiver for emergencies. - Porphyry
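
One way to act on the suggestion in the comments above, capping how many points ever reach the renderer, is min/max decimation: split the data into buckets and keep only the lowest and highest sample of each. This is a sketch under assumptions, not something from the thread; the function name decimate_minmax and the default of 2000 buckets are hypothetical, and whether the result stays under Agg's limit depends on the data and figure size. Matplotlib also has an 'agg.path.chunksize' rcParam that splits long paths during rendering, which is not shown here.

```python
import numpy as np


def decimate_minmax(x, y, max_buckets=2000):
    """Return at most 2 * max_buckets points, keeping each bucket's min and max."""
    x = np.asarray(x)
    y = np.asarray(y)
    n = len(y)
    if n <= 2 * max_buckets:
        return x, y  # already small enough, nothing to do

    bucket = n // max_buckets       # samples per bucket
    usable = bucket * max_buckets   # a trailing remainder (< max_buckets samples) is dropped
    xb = x[:usable].reshape(max_buckets, bucket)
    yb = y[:usable].reshape(max_buckets, bucket)

    lo = yb.argmin(axis=1)
    hi = yb.argmax(axis=1)
    # Emit the two picks in their original order so x stays sorted.
    first = np.minimum(lo, hi)
    second = np.maximum(lo, hi)
    rows = np.arange(max_buckets)

    x_out = np.column_stack([xb[rows, first], xb[rows, second]]).ravel()
    y_out = np.column_stack([yb[rows, first], yb[rows, second]]).ravel()
    return x_out, y_out


a = np.arange(1000000)
b = np.random.randn(1000000)
a_small, b_small = decimate_minmax(a, b)
print(len(a_small))  # 4000 points instead of 1000000
```

Because every bucket keeps its extremes, the envelope of the line looks the same at screen resolution, and ax.plot(a_small, b_small) draws a path with a few thousand vertices rather than a million.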

I found an interesting answer at http://www.mail-archive.com/[email protected]/msg11809.html that may help.

Try replacing:

import matplotlib.pyplot as plt
fig = plt.figure()

with

from matplotlib import figure
fig = figure.Figure()
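
A slightly fuller sketch of that replacement, assuming the Agg backend and an in-memory PNG as the output (the helper name render_png is hypothetical). pyplot keeps every figure it creates in a global registry until the figure is closed, while a Figure built directly is an ordinary object that can be garbage-collected once nothing references it. Attaching a FigureCanvasAgg explicitly lets savefig work without pyplot.

```python
import io

import numpy as np
from matplotlib.backends.backend_agg import FigureCanvasAgg
from matplotlib.figure import Figure


def render_png(x, y):
    """Render a line plot to PNG bytes without going through pyplot."""
    fig = Figure(figsize=(10, 7), dpi=100)
    FigureCanvasAgg(fig)  # attaches an Agg canvas to fig; pyplot never sees this figure
    ax = fig.add_subplot(111)
    ax.plot(x, y)

    buf = io.BytesIO()
    fig.savefig(buf, format="png")
    return buf.getvalue()
    # fig goes out of scope here, so no plt.close() call is needed


png = render_png(np.arange(1000), np.random.randn(1000))
```

In a Django view the returned bytes could be sent back in the response with an image/png content type instead of being written to disk.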
Garner answered 6/9, 2012 at 12:20 Comment(1)
This solution works for the error "MemoryError: In RendererAgg: Out of memory" when saving multiple plots using a for loop. - Ninfaningal
