You are getting this error because multiprocessing uses pickle, which in general can only serialize functions defined at the top level of a module. The function addi is not a top-level function. In fact, the line global addi does not help, because addi is never declared in the outer module. There are three ways to fix this.
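Before going through them, here is a minimal sketch of the limitation itself, using pickle alone and no processes. The names top_level, make_local and can_pickle are hypothetical helpers made up for this illustration; they are not part of your code or of any library.

```python
import pickle


def top_level(a, b):
    # Defined at module level, so pickle can store it by reference (module + name).
    return a + b


def make_local():
    # "local" only exists inside make_local, so pickle has no importable
    # name to refer to when asked to serialize it.
    def local(a, b):
        return a + b
    return local


def can_pickle(obj):
    """Return True if pickle.dumps(obj) succeeds, False otherwise."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        # The exact exception type for local functions differs between Python versions.
        return False
    return True


print("top-level function:", can_pickle(top_level))  # expected True when run as a script
print("local function:", can_pickle(make_local()))   # False
```

A nested function has '<locals>' in its __qualname__, which is why pickle cannot look it up by name.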
Method 1
You can define addi in the global scope before executing the calc function:
import multiprocessing as mp
import os


def addi(num1, num2):
    print(num1 + num2)


def calc(num1, num2):
    m = mp.Process(target=addi, args=(num1, num2))
    m.start()

    print("here is main", os.getpid())
    m.join()


if __name__ == "__main__":
    # creating processes
    calc(5, 6)
Output
here is main 9924
11
Method 2
You can switch to multiprocess, a third-party fork of multiprocessing that uses dill instead of pickle and can therefore serialize such functions.
import multiprocess as mp  # Note that we are importing "multiprocess", no "ing"!
import os


def calc(num1, num2):

    def addi(num1, num2):
        print(num1 + num2)

    m = mp.Process(target=addi, args=(num1, num2))
    m.start()

    print("here is main", os.getpid())
    m.join()


if __name__ == "__main__":
    # creating processes
    calc(5, 6)
Output
here is main 67632
11
Method 2b
While it's a useful library, there are a few valid reasons why you may not want to use multiprocess. A big one is that the standard library's multiprocessing and this fork are not compatible with each other (especially if you use anything from the subpackage multiprocessing.managers). This means that if you use this fork in your own project, but also use third-party libraries which themselves use the standard library's multiprocessing, you may see unexpected behaviour.
In cases where you want to stick with the standard library's multiprocessing and not use the fork, you can use dill yourself to serialize Python closures like the function addi, by subclassing the Process class and adding some logic of your own. An example is given below:
import dill
from multiprocessing import Process  # Use the standard library only
import os


class DillProcess(Process):

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._target = dill.dumps(self._target)  # Save the target function as bytes, using dill

    def run(self):
        if self._target:
            self._target = dill.loads(self._target)    # Unpickle the target function before executing
            self._target(*self._args, **self._kwargs)  # Execute the target function


def calc(num1, num2):

    def addi(num1, num2):
        print(num1 + num2)

    m = DillProcess(target=addi, args=(num1, num2))  # Note how we use DillProcess, and not multiprocessing.Process
    m.start()

    print("here is main", os.getpid())
    m.join()


if __name__ == "__main__":
    # creating processes
    calc(5, 6)
Output
here is main 23360
11
Method 3
This method is for those who cannot use any third-party libraries in their code. I recommend making sure that the above methods do not work for you before resorting to this one, because it's a little hacky and you do need to restructure some of your code.
This method works by referencing your local functions from the top-level module scope, so that they become accessible to pickle. To do this dynamically, we create a placeholder class and add all the local functions as its class attributes. We also need to make sure that each function's __qualname__ attribute is altered to point to its new location, and that all of this is done on every run, outside the if __name__ ... block (otherwise newly started processes won't see the attributes). Consider a slightly modified version of your code:
import multiprocessing as mp
import os


def calc(num1, num2):

    def addi(num1, num2):
        print(num1 + num2)

    # Another local function you might have
    def addi2():
        print('hahahaha')

    m = mp.Process(target=addi, args=(num1, num2))
    m.start()

    print("here is main", os.getpid())
    m.join()


if __name__ == "__main__":
    # creating processes
    calc(5, 6)
Below is how you can make it work using the method described above:
import multiprocessing as mp
import os


# This is our placeholder class, all local functions will be added as its attributes
class _LocalFunctions:
    @classmethod
    def add_functions(cls, *args):
        for function in args:
            setattr(cls, function.__name__, function)
            function.__qualname__ = cls.__qualname__ + '.' + function.__name__


def calc(num1, num2, _init=False):
    # The _init parameter is to initialize all local functions outside the __main__ block without actually
    # running the whole function. Basically, you shift all local function definitions to the top and add them
    # to our _LocalFunctions class. If the _init parameter is True, the function call was made only to
    # initialize the local functions and you SHOULD NOT do anything else. This means that after they are
    # initialized, you simply return (check below)

    def addi(num1, num2):
        print(num1 + num2)

    # Another local function you might have
    def addi2():
        print('hahahaha')

    # Add all functions to the _LocalFunctions class, separating each with a comma:
    _LocalFunctions.add_functions(addi, addi2)

    # IMPORTANT: return and don't actually execute the logic of the function if _init is True!
    if _init is True:
        return

    # Beyond here is where you put the function's actual logic including any assertions, etc.
    m = mp.Process(target=addi, args=(num1, num2))
    m.start()

    print("here is main", os.getpid())
    m.join()


# All factory functions must be initialized BEFORE the "if __name__ ..." clause. If they require any parameters,
# substitute with bogus ones and make sure to put the _init parameter value as True!
calc(0, 0, _init=True)

if __name__ == '__main__':
    a = calc(5, 6)
So there are a few things you would need to change in your code: all local functions must be defined at the top of their enclosing function, and every factory function must be initialized (for which it needs to accept the _init parameter) outside the if __name__ ... clause. But this is probably the best you can do if you can't use dill.