Out of Memory Killer activated for python script running a multiprocessing Queue?
I have written a python program that needs to run for multiple days at a time, because of the constant collection of data. Previously I had no issues running this program for months at a time. I recently made some updates to the program, and now after around 12 hours I get the dreaded out of memory killer. The 'dmesg' output is the following:

[9084334.914808] Out of memory: Kill process 2276 (python2.7) score 698 or sacrifice child
[9084334.914811] Killed process 2276 (python2.7) total-vm:13279000kB, anon-rss:4838164kB, file-rss:8kB

Besides general Python coding, the main change made to the program was the addition of a multiprocessing Queue. This is the first time I have ever used this feature, so I am not sure if this might be the cause of the issue. The purpose of the Queue in my program is to be able to make dynamic changes in a parallel process. The Queue is created in the main program and is continually monitored in the parallel process. A simplified version of how I am doing this in the parallel process is the following (with 'q' being the Queue):

while(1):

    if q.empty():
        None

    else:
        fr = q.get()
        # Additional code

    time.sleep(1)

The dynamic changes to 'q' do not happen very often, so the majority of the time q.empty() will be true, but the loop is there to be ready as soon as changes are made. My question is, would running this code for multiple hours at a time cause the memory to eventually run low? With the 'while' loop being pretty short and running basically non-stop, I was thinking this might be a problem. If this could be the cause of the problem, does anybody have any suggestions on how to improve the code so the out of memory killer doesn't get called?

Thank you very much.

Patton answered 27/2, 2014 at 16:1 Comment(6)
if q.empty(): pass is the idiomatic way to write that block. But it would be even better to just have if not q.empty(): fr = q.get() in the first place. - Snowplow
What implementation are you using? Is .get() removing the element from the queue or only inspecting it? In the latter case, the size of the queue is monotonically increasing. - Quaff
Do you have swap enabled? What's the output of free? - Trudietrudnak
Thank you for all the comments. The way I am using .get() is to remove a list from the Queue. So I am using .put() in the main program to send the list to the parallel process and .get() is what picks it up. I assumed previously that after .get() is called, the Queue is cleared, but maybe I am wrong. - Patton
I stopped running the program some time ago, but here is the current output of 'free':

                 total       used       free     shared    buffers     cached
    Mem:       8177396    1659060    6518336          0     255036     560160
    -/+ buffers/cache:     843864    7333532
    Swap:     10027004     869020    9157984

- Patton
Did you try changing /proc/<pid>/oom_score for your process? - Metallize

The only way you can run out of memory in the way you describe is if you're using more and more memory as time goes on. The loop here does not exhibit this behavior, so it cannot be (solely) responsible for any memory errors. Running a tight, infinite loop can burn through a lot of needless processor cycles, but it can't cause a MemoryError by itself unless it's storing data somewhere else.
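As a rough sketch of how the polling could be avoided, the worker can let get() block with a timeout instead of checking q.empty() and sleeping. This assumes Python 3, where the Empty exception lives in the queue module (Python 2.7 names that module Queue). The helper name next_change is made up for illustration and is not part of the asker's program.

```python
import multiprocessing
import queue  # provides the Empty exception raised on timeout


def next_change(q, timeout=1.0):
    """Wait up to `timeout` seconds for an item; return None if none arrived.

    Hypothetical helper: it replaces the `if q.empty(): ... time.sleep(1)`
    pattern with a single blocking call.
    """
    try:
        return q.get(timeout=timeout)
    except queue.Empty:
        return None


# Possible use inside the parallel process (assumed structure):
#
# while True:
#     fr = next_change(q)
#     if fr is not None:
#         ...  # additional code
```

This does not change memory behaviour by itself; it only removes the separate empty() check and the fixed sleep, so a change is picked up as soon as it is put on the queue.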

It's likely that elsewhere in your code, you're holding onto some variables that you don't intend to. This is called a memory leak, and you can use a memory profiler to look for where such a leak is coming from.
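One possible way to look for such a leak is the standard library's tracemalloc module, sketched below. Note the assumption: tracemalloc ships with Python 3.4 and later, so it is not available in the stock python2.7 the asker is running. The names top_growth, leaky_cache and leaky_step are hypothetical and exist only to give the profiler something to find.

```python
import tracemalloc


def top_growth(workload, limit=3):
    """Run `workload` and return the source lines whose allocations grew most."""
    tracemalloc.start()
    before = tracemalloc.take_snapshot()
    workload()
    after = tracemalloc.take_snapshot()
    tracemalloc.stop()
    # compare_to() returns StatisticDiff objects, largest change first
    return after.compare_to(before, "lineno")[:limit]


# Deliberately leaky stand-in for "variables that never leave scope"
leaky_cache = []


def leaky_step():
    leaky_cache.append(bytearray(1024 * 1024))  # 1 MiB kept alive forever


for stat in top_growth(leaky_step):
    print(stat)
```

In a long-running program the same idea would be applied by taking snapshots some hours apart and comparing them, rather than wrapping a single call.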

Some likely suspects are caching methods used to improve performance, or lists of variables that never leave scope. Perhaps your multiprocessing queue is holding on to references to earlier data objects, or items are never deleted from the queue once they're inserted? (This latter case is unlikely given the code you've shown if you're using the builtin queue.Queue, but anything is possible).
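If the worry is that items pile up in the queue faster than they are consumed, one option is to give the queue a maximum size so that the producer notices instead of memory growing silently. The sketch below assumes Python 3 and a multiprocessing.Queue; the function name demo_bounded_queue is invented for the example. It also illustrates that get() removes the item it returns, which frees a slot.

```python
import multiprocessing
import queue  # provides the Full exception


def demo_bounded_queue(maxsize=2):
    q = multiprocessing.Queue(maxsize=maxsize)
    for i in range(maxsize):
        q.put_nowait([i])

    # The queue is full, so a further non-blocking put is rejected
    try:
        q.put_nowait(["overflow"])
        rejected = False
    except queue.Full:
        rejected = True

    # get() removes the oldest item, making room for one more
    first = q.get(timeout=5)
    q.put_nowait(["fits now"])

    drained = [q.get(timeout=5) for _ in range(maxsize)]
    return rejected, first, drained
```

Whether a bounded queue is appropriate depends on what the producer should do when the consumer falls behind (block, drop, or log), which the question does not specify.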

Snowplow answered 27/2, 2014 at 16:7 Comment(5)
This is not an answer. Post it as a comment. - Quaff
@Stefano How is this not an answer? "My question is, would running this code for multiple hours at a time cause the memory to eventually run low?" The answer is No, the given code cannot cause the program to run out of memory. - Snowplow
Your answer reads: I cannot conclude that your code is causing this error by itself, please provide more details. This is what I would put in a comment. And that piece of (pseudo)code can well be the cause, because you don't know the behaviour of queue.get. - Quaff
@Stefano I'm assuming, for lack of other information, that he's using queue.Queue, which would not cause this behavior. The asker didn't ask us to find the memory leak, but rather if it could be caused by the tight loop in a parallel process. I've edited my answer to hopefully read less like a request for more information, though. - Snowplow
The question is not strictly limited to the loop: "would running this code for multiple hours at a time cause the memory to eventually run low?". And programmers should beware of assumptions. When in doubt, ask the OP. - Quaff

You can convert your program into a Linux service and set its OOM policy to continue.

See this and this link for how to view or edit service parameters and for the OOM policy service parameter, respectively.

Bonner answered 10/8, 2020 at 13:21 Comment(0)
