- I am using a `multiprocessing.Pool` which calls a function in 1 or more subprocesses to produce a large chunk of data.
- The worker process creates a `multiprocessing.shared_memory.SharedMemory` object and uses the default name assigned by `shared_memory`.
- The worker returns the string name of the `SharedMemory` object to the main process.
- In the main process the `SharedMemory` object is attached to, consumed, and then closed & unlinked.
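For concreteness, the consume step in the main process looks roughly like this (a minimal sketch; the `consume` helper and the 16-byte read are placeholders, not my real code):

```python
import multiprocessing.shared_memory as shared_memory

def consume(name):
    # Attach by the name the worker returned.
    shm = shared_memory.SharedMemory(name=name, create=False)
    data = bytes(shm.buf[:16])  # read (part of) the payload before closing
    shm.close()   # drop this process's mapping
    shm.unlink()  # remove the underlying segment
    return data
```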
At shutdown I'm seeing warnings from `resource_tracker`:
```
/usr/local/lib/python3.8/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 10 leaked shared_memory objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '
/usr/local/lib/python3.8/multiprocessing/resource_tracker.py:229: UserWarning: resource_tracker: '/psm_e27e5f9e': [Errno 2] No such file or directory: '/psm_e27e5f9e'
  warnings.warn('resource_tracker: %r: %s' % (name, e))
/usr/local/lib/python3.8/multiprocessing/resource_tracker.py:229: UserWarning: resource_tracker: '/psm_2cf099ac': [Errno 2] No such file or directory: '/psm_2cf099ac'
<8 more similar messages omitted>
```
Since I unlinked the shared memory objects in my main process, I'm confused about what's happening here. I suspect these messages are occurring in the subprocess (in this example I tested with a process pool of size 1).
Here is a minimal reproducible example:
```python
import multiprocessing
import multiprocessing.shared_memory as shared_memory


def create_shm():
    shm = shared_memory.SharedMemory(create=True, size=30000000)
    shm.close()
    return shm.name


def main():
    pool = multiprocessing.Pool(processes=4)
    tasks = [pool.apply_async(create_shm) for _ in range(200)]
    for task in tasks:
        name = task.get()
        print('Getting {}'.format(name))
        shm = shared_memory.SharedMemory(name=name, create=False)
        shm.close()
        shm.unlink()
    pool.terminate()
    pool.join()


if __name__ == '__main__':
    main()
```
I have found that this example runs fine on my own laptop (Linux Mint 19.3), but on two different server machines (unknown OS configurations, and different from each other) it does exhibit the problem. In all cases I'm running the code from a Docker container, so the Python/software configuration is identical; the only difference is the Linux kernel/host OS.
I notice this documentation that might be relevant: https://docs.python.org/3.8/library/multiprocessing.html#contexts-and-start-methods
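In case the start method is the variable here, a trimmed variant of the example that pins it explicitly would look like this (a sketch only; the choice of `'spawn'` and the pool size are arbitrary, and I haven't confirmed this changes anything):

```python
import multiprocessing
import multiprocessing.shared_memory as shared_memory

def create_shm():
    shm = shared_memory.SharedMemory(create=True, size=30000000)
    shm.close()
    return shm.name

def main():
    # Pin the start method so runs on different hosts are comparable
    # ('spawn' is an arbitrary choice for this experiment).
    ctx = multiprocessing.get_context('spawn')
    with ctx.Pool(processes=1) as pool:
        name = pool.apply(create_shm)
        shm = shared_memory.SharedMemory(name=name, create=False)
        shm.close()
        shm.unlink()

if __name__ == '__main__':
    main()
```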
I also notice that the number of "leaked shared_memory objects" varies from run to run. Since I unlink in the main process and then immediately exit, perhaps the `resource_tracker` (which I believe is a separate process) simply hasn't received an update before the main process exits. I don't understand the role of the `resource_tracker` well enough to fully evaluate what I just proposed, though.
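If that theory holds, one workaround I can imagine (a sketch only: `unregister` lives in the undocumented `multiprocessing.resource_tracker` module, and using the private `_name` attribute is an assumption based on reading the 3.8 source) would be to make the tracker forget the segment on the worker side, leaving the main process solely responsible for cleanup:

```python
from multiprocessing import resource_tracker
import multiprocessing.shared_memory as shared_memory

def create_shm():
    shm = shared_memory.SharedMemory(create=True, size=30000000)
    # Tell the resource tracker to forget this segment so it won't try
    # to clean it up (and warn) at shutdown; the main process unlinks it.
    # shm._name keeps the leading '/' the tracker was registered with.
    resource_tracker.unregister(shm._name, 'shared_memory')
    shm.close()
    return shm.name
```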