How to limit the heap size?

3

54

I sometimes write Python programs whose memory use is very difficult to predict before execution. As a result, I sometimes invoke a Python program that tries to allocate massive amounts of RAM, causing the kernel to swap heavily and degrading the performance of other running processes.

Because of this, I wish to restrict how large the Python heap can grow. When the limit is reached, the program can simply crash. What's the best way to do this?

If it matters, much of the code is written in Cython, so the solution should take into account memory allocated there. I am not married to a pure Python solution (it does not need to be portable), so anything that works on Linux is fine.

Biographer answered 22/2, 2010 at 0:21 Comment(3)
I'm confused about this question. It seems to include an answer but doesn't indicate what's wrong with it.Nagey
Looks like he's copied the accepted answer's code into his question. Presumably it's the solution?Erik
@Nagey I removed the answer from the questionVoile
52

Check out resource.setrlimit(). It only works on Unix systems but it seems like it might be what you're looking for, as you can choose a maximum heap size for your process and your process's children with the resource.RLIMIT_DATA parameter.

EDIT: Adding an example:

import resource

rsrc = resource.RLIMIT_DATA
soft, hard = resource.getrlimit(rsrc)
print('Soft limit starts as  :', soft)

resource.setrlimit(rsrc, (1024, hard))  # limit to 1024 bytes (one kibibyte)

soft, hard = resource.getrlimit(rsrc)
print('Soft limit changed to :', soft)

I'm not sure what your use case is exactly, but it's possible you need to limit the size of the stack instead, with resource.RLIMIT_STACK. Going past this limit will send a SIGSEGV signal to your process, and to handle it you will need to employ an alternate signal stack as described in the setrlimit Linux manpage. I'm not sure whether sigaltstack is exposed in Python, though, so that could prove difficult if you want to recover from going over this boundary.
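As a rough illustration of the approach above, and following the suggestion in the comments that RLIMIT_AS may work where RLIMIT_DATA does not, here is a minimal sketch. It assumes Linux and Python 3. The helper name try_oversized_allocation, the 1 GiB cap, and the choice to run the check in a child process (so the calling process keeps its own limits) are all assumptions made for this example, not part of the original answer.

```python
import resource  # imported so the module-level sketch fails early on non-Unix systems
import subprocess
import sys

LIMIT_BYTES = 1024 ** 3  # 1 GiB; an arbitrary value chosen for this sketch

# Code executed by the child interpreter. It caps its own address space
# and then asks for a block four times larger than the cap.
CHILD = """
import resource, sys
limit = int(sys.argv[1])
soft, hard = resource.getrlimit(resource.RLIMIT_AS)
if hard != resource.RLIM_INFINITY:
    limit = min(limit, hard)  # the soft limit may not exceed the hard limit
resource.setrlimit(resource.RLIMIT_AS, (limit, hard))
try:
    block = bytearray(4 * limit)
except MemoryError:
    print("MemoryError")
else:
    print("allocated")
"""

def try_oversized_allocation(limit_bytes=LIMIT_BYTES):
    """Run the child with the given RLIMIT_AS cap and return what it printed."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD, str(limit_bytes)],
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()

if __name__ == "__main__":
    print(try_oversized_allocation())
```

On a Linux system that enforces RLIMIT_AS, the child should report a MemoryError instead of growing past the cap. A program that does not catch the exception would simply exit with a traceback, which matches the "it can just crash" requirement in the question.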

Vesta answered 22/2, 2010 at 3:33 Comment(8)
Can you give an example? I tried setting various resource rlimits, but I was still able to allocate a gigabyte list. The resource module "feels right", but I can't get it to work on Linux.Biographer
I added my test to the question. My understanding is that the Python program should quit once the limit is reached.Biographer
Also, I don't care what happens if the limit is reached -- the program can crash, hang, or whatever, just as long as it doesn't keep allocating more memory.Biographer
If you don't care about recovery then don't worry about the sigaltstack stuff. Just set your stack limit with setrlimit(resource.RLIMIT_STACK, (soft, hard)). A regular process can only lower its hard limit, which just sets a maximum for the soft limit.Vesta
yeah using RLIMIT_DATA is not working for me either, but RLIMIT_STACK is (Mac OS X).Vesta
Neither RLIMIT_DATA nor RLIMIT_STACK work for me. On Linux 64-bit here.Mediocre
If RLIMIT_DATA doesn't work for you, try RLIMIT_AS (maximum address space). @YangWhereupon
How is this different than setting "ulimit -m 1500000" (1.5GB) before calling python? Does python really not have the equivalent of the Java JVM's "-XX:MaxRAM=1.5g" which tells Java how big to set the maximum heap size and to gc when nearing it?Obstruct
1

Have a look at ulimit. It allows resource quotas to be set for a shell and the processes it starts. It may need appropriate kernel settings as well.
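A limit set with ulimit in a shell is inherited by whatever that shell then launches, so the Python program does not have to call setrlimit itself. The sketch below shows this by starting a child interpreter through sh and asking it what soft limit it sees. It assumes Linux and an sh that accepts ulimit -S -v with a value in kibibytes (bash and dash both do); the function name soft_limit_seen_by_child is made up for the example.

```python
import subprocess
import sys

# Code run by the child interpreter: report its own RLIMIT_AS soft limit in bytes.
CHILD = "import resource; print(resource.getrlimit(resource.RLIMIT_AS)[0])"

def soft_limit_seen_by_child(limit_kib):
    """Set a soft virtual-memory limit in sh, then exec Python under it."""
    # $1 = limit in KiB, $2 = interpreter path, $3 = code for the child.
    script = 'ulimit -S -v "$1" && exec "$2" -c "$3"'
    result = subprocess.run(
        ["sh", "-c", script, "sh", str(limit_kib), sys.executable, CHILD],
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())

if __name__ == "__main__":
    # 4 GiB expressed in KiB; the child reports the same limit in bytes.
    print(soft_limit_seen_by_child(4 * 1024 * 1024))
```

The equivalent by hand would be running ulimit -S -v with the desired value and then starting the program from the same shell.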

Freemon answered 22/2, 2010 at 0:27 Comment(2)
When you're using PAM anyways, /etc/security/limits.conf is the better place to set limits. ulimit is only a Bash shell function.Romonaromonda
ulimit uses the same system calls as the Python resource moduleClari
0

The following code allocates memory until the process reaches the specified maximum resident set size:

import resource

def set_memory_limit(memory_kilobytes):
    # ru_maxrss: peak memory usage (bytes on OS X, kilobytes on Linux)
    usage_kilobytes = lambda: resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    rlimit_increment = 1024 * 1024  # raise the soft limit 1 MiB at a time
    # Start with a small soft limit on the data segment; the hard limit stays unlimited.
    resource.setrlimit(resource.RLIMIT_DATA, (rlimit_increment, resource.RLIM_INFINITY))

    memory_hog = []

    # Keep allocating until the peak resident set size reaches the target.
    while usage_kilobytes() < memory_kilobytes:
        try:
            for _ in range(100):
                memory_hog.append('x' * 400)
        except MemoryError:
            # The current soft limit was hit: raise it by one increment and retry.
            rlimit = resource.getrlimit(resource.RLIMIT_DATA)[0] + rlimit_increment
            resource.setrlimit(resource.RLIMIT_DATA, (rlimit, resource.RLIM_INFINITY))

set_memory_limit(50 * 1024)  # 50 MB, expressed in kilobytes

Tested on a Linux machine.
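Since ru_maxrss is reported in different units depending on the platform (bytes on OS X, kilobytes on Linux, as noted in the comment in the code above), a small helper that normalizes it to bytes may be handy if the script has to run on both. This is only a sketch; the name peak_rss_bytes is made up here, and other Unix systems are assumed to follow the Linux convention.

```python
import resource
import sys

def peak_rss_bytes():
    """Return this process's peak resident set size in bytes."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # macOS reports bytes; Linux (assumed for everything else) reports kilobytes.
    return peak if sys.platform == "darwin" else peak * 1024

if __name__ == "__main__":
    print("peak RSS: %.1f MiB" % (peak_rss_bytes() / (1024 * 1024)))
```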

Tot answered 4/12, 2018 at 16:50 Comment(0)
