This happens because on Windows, multiprocessing uses the spawn start method to create new processes. As you mentioned, unlike fork, spawn starts a fresh interpreter and rebuilds everything the child needs inside the child process itself. Therefore, if you pass any objects as arguments when starting a process, the data held by each object is serialized using pickle and sent to the new process, where the object is recreated/duplicated from that data.
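If you want to confirm which start method is in use on your machine, you can check it directly (this snippet is just a generic check, not part of your code):

import multiprocessing as mp

if __name__ == "__main__":
    # "spawn" is the only start method available on Windows;
    # on Linux the default is "fork"
    print(mp.get_start_method())        # e.g. 'spawn' on Windows
    print(mp.get_all_start_methods())   # methods supported on this platform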
How does this affect your code?
You are passing an instance of class Foo as an argument to init_child via the Process class. What happens then is that the data inside the instance, along with the fully qualified name of the class, is pickled and sent to the new process. Keep in mind that the pickled data does not include any of the code or class attributes of the class, just its name and the instance state. Then, using the class name and the instance data, a new object is created in the child, ready to be used.
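You can see this for yourself by inspecting the pickle stream of an instance. The snippet below is just a throwaway illustration with a made-up class; the disassembly shows the module and class name plus the instance's __dict__, but none of the method code:

import pickle
import pickletools

class Demo:
    def __init__(self):
        self.num = 42

    def method(self):
        return self.num

if __name__ == "__main__":
    # The stream references '__main__' and 'Demo' by name and stores {'num': 42};
    # the body of method() is never serialized.
    pickletools.dis(pickle.dumps(Demo()))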
As you may have noticed, this whole process only works if the class Foo has already been defined in the new process, and for that to happen, the module that defines Foo must be imported there as well. Multiprocessing handles all of this for you automatically, which is the reason for the ambiguity in your case.
To make things clearer, run the code below, which creates 3 workers and also prints the PID of each worker process:
from multiprocessing import Process
import os

# target func for new process
def init_child(foo_obj, bar_obj):
    print(f"Worker Process loaded in PID: {os.getpid()}")

if __name__ == "__main__":
    # protect imports from child process
    from foo import getFoo
    from bar import getBar

    # get new objects
    foo_obj = getFoo()
    bar_obj = getBar()

    for _ in range(3):
        print()

        # start new process
        child_p = Process(target=init_child, args=(foo_obj, bar_obj))
        child_p.start()

        # Wait for process to join
        child_p.join()
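This example assumes foo.py and bar.py roughly like the sketch below (the module-level print is what produces the "Loaded by PID" lines in the output; getFoo and getBar simply return plain instances):

foo.py (sketch)

import os

print('Foo Loaded by PID: ' + str(os.getpid()))

class Foo:
    def __init__(self):
        pass

def getFoo():
    return Foo()

bar.py (sketch)

import os

print('Bar Loaded by PID: ' + str(os.getpid()))

class Bar:
    def __init__(self):
        pass

def getBar():
    return Bar()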
Output
Foo Loaded by PID: 41320
Bar Loaded by PID: 41320
Foo Loaded by PID: 23864
Worker Process loaded in PID: 23864
Foo Loaded by PID: 90256
Worker Process loaded in PID: 90256
Foo Loaded by PID: 123352
Worker Process loaded in PID: 123352
As you can see, foo.py is imported again inside each worker process in order to recreate the instance.
So, what's the solution?
You can use managers. These spawn a server process where the managed object is created and stored, allowing other processes to access its shared state without creating a duplicate. In your case, foo.py will still be imported one extra time to create the instance in the server process. However, once that is done, you can freely pass the instance to other processes without worrying about the module foo.py being imported in those processes.
main.py
from multiprocessing import Process
from multiprocessing.managers import BaseManager
import os

# target func for new process
def init_child(foo_obj, bar_obj):
    print(f"Worker Process loaded in PID: {os.getpid()}")

if __name__ == "__main__":
    # protect imports from child process
    from foo import get_foo_with_manager, Foo
    from bar import getBar

    BaseManager.register('Foo', Foo)
    manager = BaseManager()
    manager.start()

    # get new objects
    foo_obj = get_foo_with_manager(manager)
    bar_obj = getBar()

    # start new process
    for _ in range(3):
        print()

        child_p = Process(target=init_child, args=(foo_obj, bar_obj))
        child_p.start()

        # Wait for process to join
        child_p.join()
foo.py
import os

print('Foo Loaded by PID: ' + str(os.getpid()))

class Foo:
    def __init__(self):
        pass

def get_foo_with_manager(manager):
    return manager.Foo()
Output
Foo Loaded by PID: 38332
Bar Loaded by PID: 38332
Foo Loaded by PID: 56632
Worker Process loaded in PID: 63404
Worker Process loaded in PID: 66400
Worker Process loaded in PID: 56044
As you can see, only one extra import of foo.py was needed to instantiate foo_obj in the server process created by the manager. After that, foo.py was never imported again.
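Also keep in mind that foo_obj in the workers is a proxy: any method you call on it is forwarded to the real instance living in the manager's server process. Here is a small standalone sketch demonstrating that (the where_am_i method is made up purely for illustration):

import os
from multiprocessing import Process
from multiprocessing.managers import BaseManager

class Foo:
    def where_am_i(self):
        # executes inside the manager's server process
        return os.getpid()

def worker(foo_proxy):
    print(f"worker PID: {os.getpid()}, Foo lives in PID: {foo_proxy.where_am_i()}")

if __name__ == "__main__":
    BaseManager.register('Foo', Foo)
    manager = BaseManager()
    manager.start()

    foo_proxy = manager.Foo()

    child_p = Process(target=worker, args=(foo_proxy,))
    child_p.start()
    child_p.join()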
Benchmark for managers
I tested how fast proxies are compared to unmanaged objects by timing how long it takes to get attribute values through a NamespaceProxy:
import time
from multiprocessing.managers import BaseManager, NamespaceProxy

class TimeNamespaceProxy(NamespaceProxy):
    def __getattr__(self, key):
        t = time.time()
        super().__getattr__(key)
        return time.time() - t

class A:
    def __init__(self):
        self.num = 3

def worker(proxy_a):
    times = []
    for _ in range(10000):
        times.append(proxy_a.num)
    print(f'Time taken for 10000 calls to get attribute with proxy is : {sum(times)}, with the avg being: {sum(times) / len(times)}')

if __name__ == "__main__":
    BaseManager.register('A', A, TimeNamespaceProxy, exposed=tuple(dir(TimeNamespaceProxy)))
    manager = BaseManager()
    manager.start()

    num = 10000
    a = A()
    proxy_a = manager.A()

    times = []
    t = time.perf_counter()
    for _ in range(num):
        a.num
    times.append(time.perf_counter() - t)
    print(f'Time taken for {num} calls to get attribute without proxy is : {sum(times)}, with the avg being: {sum(times) / num}')

    times = []
    for _ in range(10000):
        times.append(proxy_a.num)
    print(f'Time taken for {num} calls to get attribute with proxy is : {sum(times)}, with the avg being: {sum(times) / num}')
Output
Time taken for 10000 calls to get attribute without proxy is : 0.0011279000000000705, with the avg being: 1.1279000000000705e-07
Time taken for 10000 calls to get attribute with proxy is : 1.751499891281128, with the avg being: 0.0001751499891281128
As you can see, not using proxies is more than 1000 times faster, but, in absolute terms, using proxies only takes about 0.2 ms to pickle/unpickle the function name __getattribute__ and the argument, and then pickle/unpickle the result. For most applications, simply sending the function name over to the manager process will not be a bottleneck. Things only get tricky when the return values/arguments being sent over are complex and take longer to pickle, but these are common shenanigans when working with multiprocessing in general (I would recommend storing recurring arguments in the managed object itself, if possible, to reduce the overhead).
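For example, if a worker repeatedly needs the same large argument, you can store it once in the managed object and only send small keys afterwards. The following is just a hedged sketch with made-up names:

from multiprocessing.managers import BaseManager

# Hypothetical managed object: the large data lives in the manager process,
# so later calls only ship a small key back and forth.
class Store:
    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def total(self, key):
        # works on data that already lives in the manager's process
        return sum(self._data[key])

if __name__ == "__main__":
    BaseManager.register('Store', Store)
    manager = BaseManager()
    manager.start()

    store = manager.Store()
    store.put('readings', list(range(1_000_000)))  # large data pickled once
    print(store.total('readings'))                 # later calls only send the key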
In short, implementing process safety always comes with a performance tradeoff. If your use case requires particularly performant code, then you might want to re-evaluate your choice of language, because the tradeoff is especially large with Python.
init_obj would be a proxy object, and when a method call is made on this proxy, it is carried out by sending a message, via either a socket or a named pipe, to a thread started by the manager process, which fields such method requests and performs the call on the actual object living in the manager's process. In short, there is execution overhead you would not have with unmanaged objects, resulting in much slower performance for every method call. – Plantar