I read that RawArray
can be shared between proceses without being copied, and wanted to understand how it is possible in Python.
I saw in sharedctypes.py, that a RawArray
is constructed from a BufferWrapper
from heap.py, then nullified with ctypes.memset
.
BufferWrapper
is made of an Arena
object, which itself is built from an mmap
(or 100 mmaps in windows, see line 40 in heap.py)
I read that the mmap
system call is actually used to allocate memory in Linux/BSD, and the Python module uses MapViewOfFile for windows.
mmap
seems handy then. It seems to be able to work directly with mp.pool
-
from struct import pack
from mmap import mmap
def pack_into_mmap(idx_nums_tup):
idx, ints_to_pack = idx_nums_tup
pack_into(str(len(ints_to_pack)) + 'i', shared_mmap, idx*4*total//2 , *ints_to_pack)
if __name__ == '__main__':
total = 5 * 10**7
shared_mmap = mmap(-1, total * 4)
ints_to_pack = range(total)
pool = Pool()
pool.map(pack_into_mmap, enumerate((ints_to_pack[:total//2], ints_to_pack[total//2:])))
My question is -
How does the multirocessing module know not to copy the mmap
based RawArray
object between processes, like it does with "regular" python objects?