What uses the memory of my python process? (RSS vs VMS)

If I execute the Python interpreter it needs roughly 111 MByte:

>>> import psutil
>>> psutil.Process().memory_info()
pmem(rss=19451904, vms=111677440, shared=6905856, text=4096, lib=0, data=12062720, dirty=0)

After importing Django and calling django.setup(), it uses 641 MByte:

>>> import django
>>> django.setup()
>>> psutil.Process().memory_info()
pmem(rss=188219392, vms=641904640, shared=27406336, text=4096, lib=0, data=284606464, dirty=0)

And the WSGI process (which has already served some HTTP requests) uses 919 MByte:

>>> psutil.Process(13843).memory_info()
pmem(rss=228777984, vms=919306240, shared=16076800, text=610304, lib=0, data=485842944, dirty=0)

I think that's too much.

What can I do to investigate this in more detail? What occupies the memory?

Background: From time to time memory on the server is running low and the oom-killer terminates processes.

Scarify answered 3/12, 2019 at 15:57 Comment(5)
Bear in mind that memory managers are lazy. They don't go out of their way to "clean things up" if they don't have to. Likewise, the operating system doesn't bother to take memory away from a process unless something else is putting pressure on the memory resource, which is unlikely on today's capacious machines, especially your development box. "Memory use" might be considerably overstated due to this laziness. - Bandsman
@MikeRobinson I updated the question: Background: From time to time memory on the server is running low and the oom-killer terminates processes. - Scarify
rss is the figure of interest, not virtual memory. Use a memory profiler to find the cause of excessive memory utilization. - Cenacle
To expand on zgoda's comment: vms essentially tells you how much address space your program has been allocated, not how much RAM is in use. See e.g. this answer. - Betake
Any global state (global variables, class attributes...) is a first and obvious culprit. Note that you definitely don't want any mutable global state in a WSGI app anyway. Also, Python is known for rarely releasing memory back to the OS, so if you have some memory-hungry processing you may want to run it as a distinct process (subprocess, async task queue or whatever). And finally, even with those precautions, you want something that kills and restarts your server processes every once in a while (some WSGI servers have this kind of option, e.g. based on the number of requests served). - Radioactivate

You're looking at the wrong attribute:

  • rss is the Resident Set Size: the physical memory (RAM) the process currently occupies
  • vms is the Virtual Memory Size: the total virtual address space the process has mapped, much of which may never be backed by RAM

The kernel gives each process its own view of memory, in which the process behaves as if it were the only program running on the system; that is what the virtual address space is for. In reality, the kernel's memory management coordinates the physical memory shared by all processes. Also note that shared libraries play a part in the reported figures: their pages are counted for each process that maps them, even though they are shared between processes.
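
To see the difference on a Linux box, here is a minimal sketch that uses only the standard library, so it reads /proc/self/status instead of calling psutil. It assumes Linux; the helper names parse_status, read_self and demo are made up for this illustration. It reserves a large anonymous mapping without touching it, which should raise VmSize (what psutil reports as vms) while leaving VmRSS (rss) roughly unchanged:

```python
import mmap


def parse_status(text):
    """Extract VmRSS and VmSize (in bytes) from /proc/<pid>/status text."""
    wanted = {"VmRSS", "VmSize"}
    result = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        if key in wanted:
            # Values look like "   18996 kB"; convert to bytes.
            result[key] = int(rest.split()[0]) * 1024
    return result


def read_self():
    """Read the current process's figures (Linux only)."""
    with open("/proc/self/status") as fh:
        return parse_status(fh.read())


def demo(size=256 * 1024 * 1024):
    """Map `size` bytes anonymously without touching them (Linux only)."""
    before = read_self()
    region = mmap.mmap(-1, size)  # address space reserved, pages untouched
    after = read_self()
    region.close()
    return before, after
```

Calling demo() on Linux typically shows VmSize growing by about the size of the mapping while VmRSS barely moves, because pages only start to occupy RAM once they are actually touched.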

Regarding your OOM incident, find out which process was killed and what it was doing at the time. Linux exposes each process's OOM score in /proc/PID/oom_score and uses it to choose which process to kill in OOM situations; a higher value indicates a higher probability of selection. The score is based mainly on how much memory the process uses; kernels before 2.6.36 also applied heuristics such as the number of children, how long the process had been running, CPU usage and niceness. You can tweak the score for a process by writing to /proc/PID/oom_score_adj.
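
As a rough sketch of how to check which processes the OOM killer currently favours, the following reads oom_score, oom_score_adj and comm for every numeric entry under /proc. It assumes Linux; read_text and oom_ranking are hypothetical helper names, and the proc_root parameter exists only so the function can be pointed at a different directory. It only reads the files and never writes to oom_score_adj:

```python
import os


def read_text(path):
    """Return the stripped file contents, or None if it cannot be read."""
    try:
        with open(path) as fh:
            return fh.read().strip()
    except OSError:
        return None


def oom_ranking(proc_root="/proc"):
    """List (oom_score, oom_score_adj, pid, name), highest score first."""
    rows = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        base = os.path.join(proc_root, entry)
        score = read_text(os.path.join(base, "oom_score"))
        if not score:
            continue  # process exited or file not readable
        adj = read_text(os.path.join(base, "oom_score_adj")) or "0"
        name = read_text(os.path.join(base, "comm")) or "?"
        rows.append((int(score), int(adj), int(entry), name))
    rows.sort(key=lambda row: row[0], reverse=True)
    return rows
```

For example, print(oom_ranking()[:5]) lists the five most likely victims at that moment; comparing this with the kernel log entries from the OOM incident can help confirm whether the WSGI workers are the ones being selected.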

But rather than adjusting the OOM score, try to debug the actual problem in the process. A memory profiler like valgrind might be helpful in this regard.
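
For Python-level allocations, the standard-library tracemalloc module is one option (it is not the tool named above, just a lighter-weight alternative that needs no extra install). A minimal sketch, where measure is a hypothetical helper name, that reports which source lines allocated the most memory while a given callable ran:

```python
import tracemalloc


def measure(func, limit=5):
    """Run func() and return (result, top allocation diffs by source line)."""
    tracemalloc.start()
    try:
        before = tracemalloc.take_snapshot()
        result = func()  # keep the result alive so its memory is still counted
        after = tracemalloc.take_snapshot()
    finally:
        tracemalloc.stop()
    # compare_to() returns StatisticDiff objects, largest size change first.
    stats = after.compare_to(before, "lineno")
    return result, stats[:limit]
```

In the situation from the question, one might wrap the expensive step, for instance measure(django.setup), and print each returned entry to see which modules account for the growth. Note that tracemalloc only sees memory allocated through Python's allocators, so memory allocated directly by C extensions will not show up.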

Bacchius answered 3/12, 2019 at 16:20 Comment(0)
