Reduce memory fragmentation with MALLOC_MMAP_THRESHOLD_ and MALLOC_MMAP_MAX_

I've been experimenting with the MALLOC_MMAP_THRESHOLD_ and MALLOC_MMAP_MAX_ environment variables to change how memory is managed in a long-running Python 2 process. See http://man7.org/linux/man-pages/man3/mallopt.3.html

I got the idea from this bug report: http://bugs.python.org/issue11849

The results are encouraging: memory fragmentation is reduced, and the high-water mark that long-running processes typically show in their memory usage is lower.

My only concern is whether such low-level tweaks have other side effects that may bite back. Does anyone have experience using them?

Here is an example script that shows how those variables affect RSS memory in a script that generates a large dictionary: https://gist.github.com/lbolla/8e2640133032b0a6bb9c Just run "alloc.sh" and compare the output. Here is the output for me:

MALLOC_MMAP_THRESHOLD_=None MALLOC_MMAP_MAX_=None
N=9 RSS=120968
MALLOC_MMAP_THRESHOLD_=512 MALLOC_MMAP_MAX_=None
N=9 RSS=157008
MALLOC_MMAP_THRESHOLD_=1024 MALLOC_MMAP_MAX_=None
N=9 RSS=98484
MALLOC_MMAP_THRESHOLD_=2048 MALLOC_MMAP_MAX_=None
N=9 RSS=98484
MALLOC_MMAP_THRESHOLD_=4096 MALLOC_MMAP_MAX_=None
N=9 RSS=98496
MALLOC_MMAP_THRESHOLD_=100000 MALLOC_MMAP_MAX_=None
N=9 RSS=98528
MALLOC_MMAP_THRESHOLD_=512 MALLOC_MMAP_MAX_=0
N=9 RSS=121008
MALLOC_MMAP_THRESHOLD_=1024 MALLOC_MMAP_MAX_=0
N=9 RSS=121008
MALLOC_MMAP_THRESHOLD_=2048 MALLOC_MMAP_MAX_=0
N=9 RSS=121012
MALLOC_MMAP_THRESHOLD_=4096 MALLOC_MMAP_MAX_=0
N=9 RSS=121000
MALLOC_MMAP_THRESHOLD_=100000 MALLOC_MMAP_MAX_=0
N=9 RSS=121008
MALLOC_MMAP_THRESHOLD_=512 MALLOC_MMAP_MAX_=16777216
N=9 RSS=157004
MALLOC_MMAP_THRESHOLD_=1024 MALLOC_MMAP_MAX_=16777216
N=9 RSS=98484
MALLOC_MMAP_THRESHOLD_=2048 MALLOC_MMAP_MAX_=16777216
N=9 RSS=98484
MALLOC_MMAP_THRESHOLD_=4096 MALLOC_MMAP_MAX_=16777216
N=9 RSS=98496
MALLOC_MMAP_THRESHOLD_=100000 MALLOC_MMAP_MAX_=16777216
N=9 RSS=98528

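As you can see, with a threshold of 1024 or higher and mmap left enabled, RSS is about 19% lower than vanilla Python for this example (98484 vs. 120968). A threshold of 512 makes things worse (RSS around 157000), and MALLOC_MMAP_MAX_=0 leaves RSS essentially at the vanilla level.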
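For anyone who wants to repeat this kind of comparison without the gist, here is a minimal sketch of the same idea. It is not the original alloc.sh: the child program, the dictionary size, and the helper name `peak_rss` are my own assumptions, and it reports peak RSS via `resource.getrusage` (kB on Linux) rather than whatever the gist measures, so the numbers will not match the ones above.

```python
import os
import subprocess
import sys

# Hypothetical child program: build a large dict, then print this process's
# peak RSS (ru_maxrss is reported in kB on Linux).
CHILD = (
    "import resource\n"
    "d = {i: str(i) for i in range(200000)}\n"
    "print(resource.getrusage(resource.RUSAGE_SELF).ru_maxrss)\n"
)


def peak_rss(threshold=None, mmap_max=None):
    """Run CHILD in a fresh interpreter with the given malloc settings."""
    env = dict(os.environ)
    # Start from a clean slate so inherited settings do not leak in.
    env.pop("MALLOC_MMAP_THRESHOLD_", None)
    env.pop("MALLOC_MMAP_MAX_", None)
    if threshold is not None:
        env["MALLOC_MMAP_THRESHOLD_"] = str(threshold)
    if mmap_max is not None:
        env["MALLOC_MMAP_MAX_"] = str(mmap_max)
    out = subprocess.check_output([sys.executable, "-c", CHILD], env=env)
    return int(out.decode().strip())


if __name__ == "__main__":
    for threshold in (None, 512, 1024, 100000):
        print("MALLOC_MMAP_THRESHOLD_=%s RSS=%d" % (threshold, peak_rss(threshold)))
```

The variables have to be in the environment before the interpreter starts, which is why each measurement runs in a new process.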

Chivalry answered 26/2, 2016 at 20:14 Comment(3)
One way to work around this is to do the work in a forked process that then exits. - Klaraklarika
@Klaraklarika I can't do that, because the process in question is long-running. It's a server, supposed to be there for a long time. - Chivalry
@Ibolla you can do this even in server cases. Fork from the server process, run the memory-allocating operation in the forked process, return the result from the forked process to the server process, terminate the forked process, and return the result to the client that requested it. Now, that doesn't always mean you've solved the problem. Maybe the input and output are so large they're going to require huge memory allocations on the server anyway. YMMV, but you can do it. - Jozef
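The fork-and-exit approach described in these comments could look roughly like the sketch below. It assumes a Unix system and a picklable result; `run_in_child` and `build_and_summarize` are hypothetical names used only for illustration.

```python
import os
import pickle


def run_in_child(func, *args):
    """Run func(*args) in a forked child and return its (picklable) result."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: do the memory-hungry work, send the result back, then exit.
        status = 1
        try:
            os.close(read_fd)
            result = func(*args)
            with os.fdopen(write_fd, "wb") as out:
                pickle.dump(result, out)
            status = 0
        finally:
            # _exit skips the parent's cleanup handlers in the child.
            os._exit(status)
    # Parent: read the result first, then reap the child.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as inp:
        data = inp.read()
    _, wait_status = os.waitpid(pid, 0)
    if wait_status != 0 or not data:
        raise RuntimeError("child process failed")
    return pickle.loads(data)


def build_and_summarize(n):
    # Stand-in for the real allocation-heavy operation.
    big = {i: str(i) for i in range(n)}
    return len(big)
```

Calling `run_in_child(build_and_summarize, 1000000)` from the server leaves the large dictionary in the child, whose memory goes back to the OS when it exits. As the last comment notes, this only helps if the result passed back is small compared to the working set.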

I have been running this tweak in production for a long time now, without issues. So I think it's a viable option for improving memory usage in long-running Python processes, in certain cases.

Chivalry answered 30/10, 2016 at 20:54 Comment(0)

I'm also using:

MALLOC_MMAP_THRESHOLD_=8192

MALLOC_ARENA_MAX=4

So far, great results!
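If changing the environment of the process is inconvenient, the same parameters can in principle be set from inside Python by calling glibc's mallopt(3) (the man page linked in the question) through ctypes. This is only a sketch under the assumption that the interpreter runs on glibc; the helper name `set_malloc_option` is made up, and the parameter numbers are the ones defined in glibc's `<malloc.h>`. Unlike the environment variables, which are read when malloc initializes, a call made later only affects subsequent allocations. According to the man page, explicitly setting M_MMAP_THRESHOLD also disables glibc's dynamic adjustment of that threshold.

```python
import ctypes
import ctypes.util

# Parameter numbers from glibc's <malloc.h>.
M_MMAP_THRESHOLD = -3
M_ARENA_MAX = -8


def set_malloc_option(param, value):
    """Call mallopt(param, value); return True on success, False otherwise."""
    try:
        libc = ctypes.CDLL(ctypes.util.find_library("c") or "libc.so.6")
        mallopt = libc.mallopt
    except (OSError, AttributeError):
        # Not glibc, or this libc does not export mallopt.
        return False
    mallopt.argtypes = [ctypes.c_int, ctypes.c_int]
    mallopt.restype = ctypes.c_int
    # mallopt returns 1 on success and 0 on error.
    return mallopt(param, value) == 1


if __name__ == "__main__":
    print(set_malloc_option(M_MMAP_THRESHOLD, 8192))
    print(set_malloc_option(M_ARENA_MAX, 4))
```

Call it as early as possible in the process, before most allocations have happened.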

Sedation answered 20/7, 2020 at 13:38 Comment(0)
