How does shared_memory circumvent the pickle treatment?
I think you are confusing shared ctypes and shared objects between processes.
First, you don't have to use the sharing mechanisms provided by multiprocessing
in order to get shared objects. You can just wrap basic primitives such as mmap
(or the Windows equivalent), or get fancier using any API that your OS/kernel provides.
Next, the second link you mention, about how the copy is done and how __getstate__
defines the pickling behavior, only matters if you are using the sharedctypes
module API. You are not forced to perform pickling to share memory between two processes.
In fact, sharedctypes is backed by anonymous shared memory, which uses: https://github.com/python/cpython/blob/master/Lib/multiprocessing/heap.py#L31
Both implementations rely on an mmap-like primitive.
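To make the "mmap-like primitive" point concrete, here is a minimal sketch of sharing bytes through an anonymous mapping with no multiprocessing and no pickle involved. It assumes a POSIX system with os.fork; the function name share_via_anonymous_mmap is made up for illustration.

```python
import mmap
import os


def share_via_anonymous_mmap(message: bytes) -> bytes:
    """Let a forked child write into an anonymous mapping that the parent reads."""
    if not hasattr(os, "fork"):
        raise RuntimeError("this sketch needs os.fork (POSIX only)")

    # fileno=-1 requests an anonymous mapping. On Unix the default flag is
    # MAP_SHARED, so the pages stay shared with children created by fork().
    buf = mmap.mmap(-1, mmap.PAGESIZE)

    pid = os.fork()
    if pid == 0:
        # Child: write raw bytes, then exit immediately.
        try:
            buf[:len(message)] = message
        finally:
            os._exit(0)

    # Parent: wait for the child, then read what it wrote.
    os.waitpid(pid, 0)
    data = bytes(buf[:len(message)])
    buf.close()
    return data
```

Only raw bytes cross the process boundary here, so nothing is ever pickled. The message has to fit in one page in this sketch.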
Anyway, if you try to copy something using sharedctypes, you will hit:
And this function uses ForkingPickler, which makes use of pickle, and then… ultimately, you'll call __getstate__ somewhere.
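If you want to see that pickling is tied to the sharedctypes objects themselves, here is a small sketch. It is based on my reading of the CPython source, where sharedctypes registers a reducer with ForkingPickler that refuses to run unless a child process is being spawned. The helper name pickling_is_refused_outside_spawning is hypothetical.

```python
from multiprocessing.reduction import ForkingPickler
from multiprocessing.sharedctypes import RawValue


def pickling_is_refused_outside_spawning() -> bool:
    """Return True if pickling a sharedctypes object by hand raises RuntimeError."""
    value = RawValue("i", 7)  # a ctypes int allocated from shared memory
    try:
        # Assumption: the reducer registered by sharedctypes only allows
        # pickling while multiprocessing is starting a child process.
        ForkingPickler.dumps(value)
    except RuntimeError:
        return True
    return False
```

So the pickle machinery is part of how sharedctypes objects get handed to child processes, not a requirement of shared memory in general.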
But that's not relevant to shared_memory, because shared_memory is not really a ctype-like object.
You have other ways to share objects between processes, using the Resource Sharer / Tracker API: https://github.com/python/cpython/blob/master/Lib/multiprocessing/resource_sharer.py which relies on pickle serialization/deserialization.
But you don't share shared memory through shared memory, right?
When you use: https://github.com/python/cpython/blob/master/Lib/multiprocessing/shared_memory.py
You create a block of memory with a unique name, and every process must know that name before it can share the memory; otherwise it will not be able to attach to it.
Basically, the analogy is:
You have a group of friends, and you all have a secret base whose location only you know. You will go on errands and be away from each other, but you can all meet at this one location.
In order for this to work, you must all know the location before going away from each other. If you do not have it beforehand, you cannot be certain that you will be able to figure out where to meet them.
That is the same with shared_memory: you only need its name to open it. You don't share / transfer shared_memory between processes. You access shared_memory using its unique name from multiple processes.
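A minimal sketch of the "meet at the named location" idea, using the documented SharedMemory API. Both handles live in one process here only to keep the example short; in practice the second SharedMemory(name=...) call would run in another process. The function name roundtrip_by_name is hypothetical.

```python
from multiprocessing import shared_memory


def roundtrip_by_name(payload: bytes) -> bytes:
    """Write through one handle, read back through a second one attached by name."""
    # Create the block. Python picks a unique name when none is given.
    owner = shared_memory.SharedMemory(create=True, size=len(payload))
    try:
        owner.buf[:len(payload)] = payload

        # The name is a plain str. Knowing it is all a peer needs to attach.
        peer = shared_memory.SharedMemory(name=owner.name)
        try:
            return bytes(peer.buf[:len(payload)])
        finally:
            peer.close()
    finally:
        owner.close()
        owner.unlink()  # free the block once everyone is done with it
```

Note that nothing but the name (a string) ever has to travel between the two sides.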
As a result, why would you pickle it? You can. You can absolutely pickle it. But that might not be built-in, because it's straightforward to just send the unique name to all your processes through another shared memory channel or anything like that.
There is no circumvention required here. ShareableList is just an example application of the SharedMemory class, as you can see here: https://github.com/python/cpython/blob/master/Lib/multiprocessing/shared_memory.py#L314
It requires something akin to a unique name. You can also use anonymous shared memory and transmit its name later through another channel (write a temporary file, send it back to some API, whatever).
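For illustration, here is a sketch of ShareableList being attached purely by the name of its underlying block. Again both ends are in one process for brevity, and shareable_list_demo is a made-up name.

```python
from multiprocessing import shared_memory


def shareable_list_demo() -> list:
    """Modify a ShareableList through a second handle attached by name."""
    original = shared_memory.ShareableList([1, 2.5, "abc", None])
    try:
        # original.shm is the SharedMemory block backing the list.
        attached = shared_memory.ShareableList(name=original.shm.name)
        try:
            attached[0] = 42  # visible through the original handle too
            return list(original)
        finally:
            attached.shm.close()
    finally:
        original.shm.close()
        original.shm.unlink()
```

The values are stored in the block in a packed binary layout, so the only thing to hand to the other side is original.shm.name.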
Why then is CreateFileMapping \ OpenFileMapping needed here?
Because it depends on your Python interpreter. Here you are probably using CPython, which does the following:
https://github.com/python/cpython/blob/master/Modules/mmapmodule.c#L1440
It's already using CreateFileMapping indirectly, so calling CreateFileMapping and then attaching to it just duplicates work already done in CPython.
But what about other interpreters? Do all interpreters do what is necessary to make mmap work on non-POSIX platforms? Maybe this was the developer's rationale.
Anyway, it is not surprising that mmap would work out of the box.
shared_memory dodges pickling because it only provides a memoryview that wraps the shared buffer. You can read and write raw bytes, but you cannot pass objects to SharedMemory. It has no interface for that. To get an object into memory, you would need to serialize it to raw bytes and blast those into the buffer. Pickling creeps back into the equation because of the serialization step. Got no clue on the second question. Lastly, note that mmap plays a role in both the unix and windows branches in the constructor. See line 111. Disclaimer, not an authority on anything. – Apropos
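The serialize-then-copy step described in the comment above could look roughly like this. It is only a sketch: pickle is one possible serializer among many, the function name pickle_through_shared_memory is made up, and both handles sit in one process for brevity.

```python
import pickle
from multiprocessing import shared_memory


def pickle_through_shared_memory(obj):
    """Serialize obj into a shared block, then rebuild it through a second handle."""
    payload = pickle.dumps(obj)  # SharedMemory only stores raw bytes

    writer = shared_memory.SharedMemory(create=True, size=len(payload))
    try:
        writer.buf[:len(payload)] = payload

        # A reader needs the block's name and the payload length, because the
        # block may be larger than requested (sizes can be rounded up).
        reader = shared_memory.SharedMemory(name=writer.name)
        try:
            return pickle.loads(bytes(reader.buf[:len(payload)]))
        finally:
            reader.close()
    finally:
        writer.close()
        writer.unlink()
```

Here pickling is a choice made by the caller to turn an object into bytes. SharedMemory itself never pickles anything.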