My Flask API has a small memory leak that, over a number of API calls, causes my application to hit its memory limit and crash. I've been trying to figure out why some memory isn't being released, with no success so far, although I believe I know the sources. I'd appreciate any help!
Unfortunately I can't share the code, but to describe it in English, my Flask app provides an API endpoint that lets a user do the following (all in one call):
- Pull some data from MongoDB based on an ID provided.
- From what's returned, build a document object using the python-docx library and save that to disk.
- Finally, I take what was saved to disk and upload it to an S3 bucket then delete what was on disk.
From what I can tell using the memory_profiler library, the two areas where I am seeing the most memory usage are the initialization of the Document object and connecting/saving to S3 (7MB and 4.8MB respectively).
To monitor the memory usage of my Python process, I have psutil print the RSS memory used at certain key points (example code below).
import os
import psutil

process = psutil.Process(os.getpid())
mem0 = process.memory_info().rss  # resident set size, in bytes
print('Memory Usage Before Action', mem0 / (1024**2), 'MB')
## Perform some action
mem1 = process.memory_info().rss
print('Memory Usage After Action', mem1 / (1024**2), 'MB')
print('Memory Increase After Action', (mem1 - mem0) / (1024**2), 'MB')
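To avoid repeating those three prints around every step, the same measurement could be wrapped in a context manager. This is only a sketch: `track_memory` and its `read_rss` parameter are names made up for illustration, and the demo uses fake readings so that it runs without psutil installed.

```python
import contextlib

MB = 1024 ** 2

@contextlib.contextmanager
def track_memory(label, read_rss, report=print):
    """Report memory before and after the wrapped block.

    read_rss is any zero-argument callable returning bytes; with psutil it
    could be `lambda: process.memory_info().rss` (assumed, as in the snippet above).
    """
    before = read_rss()
    result = {}
    try:
        yield result
    finally:
        after = read_rss()
        result['before_mb'] = before / MB
        result['after_mb'] = after / MB
        result['increase_mb'] = (after - before) / MB
        report('%s: %.2f MB -> %.2f MB (increase %.2f MB)' % (
            label, result['before_mb'], result['after_mb'], result['increase_mb']))

# Demo with fake readings standing in for real RSS values.
fake_readings = iter([100 * MB, 107 * MB])
with track_memory('document build', lambda: next(fake_readings)) as demo_result:
    pass  # the document build (or S3 upload) would go here
```

Keeping the reader injectable means the same wrapper can be pointed at psutil in the real app and at canned numbers when checking the wrapper itself.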
The console image provided is after I've called the app three times while hosting it locally.
What's concerning is that each successive API call seems to start at or above the memory level where the previous call ended, and then adds to it. The app starts at 93MB (see yellow highlights) and ends the first call at 103.79MB; the second call starts at 103.87MB and ends at 105.39MB; the third starts at 105.46MB and ends at 106MB. The increase per call diminishes, but after 100 calls I still see memory usage growing incrementally. The red and blue lines show the memory changes at various points during the API call: the red lines are after the document build and the blue lines are after the S3 upload.
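To make the trend in those figures explicit, the per-call growth can be worked out directly from the numbers quoted above. This is just arithmetic on the reported values; treating the 93MB startup figure as the start of the first call is an assumption, and the variable names are made up for illustration.

```python
# (start_mb, end_mb) for the three calls, as reported above.
calls = [(93.00, 103.79), (103.87, 105.39), (105.46, 106.00)]

# Growth during each call.
within_call = [round(end - start, 2) for start, end in calls]

# Drift between the end of one call and the start of the next.
between_calls = [round(calls[i + 1][0] - calls[i][1], 2)
                 for i in range(len(calls) - 1)]

# Net growth from startup to the end of the third call.
total = round(calls[-1][1] - calls[0][0], 2)

print('Within each call:', within_call)    # [10.79, 1.52, 0.54]
print('Between calls:   ', between_calls)  # [0.08, 0.07]
print('Total:           ', total)          # 13.0
```

So the first call accounts for most of the 13MB, which would be consistent with one-time initialization, while the later calls still add a smaller amount each time.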
Please note that my test program is calling the API with the same parameters every time.
I have tested, among other things, the following:
- gc.collect().
- explicitly deleting variable/object references using 'del'.
- ensuring that the mongo connection is closed (since I'm using the ibm_boto3 library for the S3 connection, I don't know if there's a way to explicitly close that connection).
- checking that there are no global variables that I'm saving to with each API call (app is the only global variable).
I know that since I can't provide code there may not be much to go on here, but if there are no ideas about the cause, I was wondering whether there's a best-practice way to handle Flask memory usage, or a way to clear out memory after the Flask function returns something. Right now my Flask functions are relatively standard Python functions (so I'd expect local variables inside these functions to be garbage collected afterwards).
I am using Python 3.6, Flask 0.11.1, and pymongo 3.6.1. My tests are currently running on a Windows 7 machine, but my IBM cloud server is seeing the same issue.