error with module multiprocessing under python3.8

I had a script that used multiprocessing without problems until today. To reproduce the problem, I replaced the function I was parallelizing with the simplified one shown below:

    from multiprocessing import Process, Queue
    import random

    def rand_num():
        num = random.random()
        print(num)

    if __name__ == "__main__":
        queue = Queue()

        processes = [Process(target=rand_num, args=()) for x in range(4)]

        for p in processes:
            p.start()

        for p in processes:
            p.join()

This produces the following error message (repeated four times; I omitted the repeats for readability):

    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/usr/local/Cellar/python@3.8/3.8.1/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/spawn.py", line 116, in spawn_main
        exitcode = _main(fd, parent_sentinel)
      File "/usr/local/Cellar/python@3.8/3.8.1/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/spawn.py", line 125, in _main
        prepare(preparation_data)
      File "/usr/local/Cellar/python@3.8/3.8.1/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/spawn.py", line 236, in prepare
        _fixup_main_from_path(data['init_main_from_path'])
      File "/usr/local/Cellar/python@3.8/3.8.1/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/spawn.py", line 287, in _fixup_main_from_path
        main_content = runpy.run_path(main_path,
      File "/usr/local/Cellar/python@3.8/3.8.1/Frameworks/Python.framework/Versions/3.8/lib/python3.8/runpy.py", line 262, in run_path
        code, fname = _get_code_from_file(run_name, path_name)
      File "/usr/local/Cellar/python@3.8/3.8.1/Frameworks/Python.framework/Versions/3.8/lib/python3.8/runpy.py", line 232, in _get_code_from_file
        with io.open_code(fname) as f:
    FileNotFoundError: [Errno 2] No such file or directory: '/Users/myUserName/<stdin>'

I don't know where to start debugging this error. I'm running Python 3.8 on macOS Catalina (Homebrew install). Please help.

Blazon answered 4/3, 2020 at 2:51 Comment(12)
Cannot reproduce with your example - please read minimal reproducible example. The TraceBack message is saying it cannot find a file - so path problems or permissions problems ...Nicolle
As an aside, you should consider using a Pool, it will simplify the code.Numismatist
Do you run it from a file (python script.py) or from the interpreter? I see File "<string>" and <stdin>, which can mean you are running it from the interpreter, and maybe that causes the problem.Cykana
@wwii: thanks. I tried on a different machine (at work with High Sierra) and it works. I need it to work on my laptop (OSX Catalina). In both cases I run the script.py from iTerm2 but get different results. My guess is that it has something to do with python version. On 3.7.6 it works but not on 3.8.Blazon
@Cykana : in iTerm2 I run python3 myScript.pyBlazon
Python 3.8 is a very new version; it may have bugs, and some modules may not work with it. It would be good to wait a few months and use the older version 3.7 in the meantime.Cykana
Python 3.8 on MacOS by default now uses "spawn" instead of "fork" as start method for new processes. Try with multiprocessing.set_start_method("fork") in the first line below if __name__ == "__main__":.Ordain
@Darkonaut: YES ! Worked !!!Blazon
thanks @Ordain! I had a similar issue, where basically my code broke after upgrading to 3.8 ... you should write a Medium post ;) -- and add an answer below so we can vote for itAspersorium
@Aspersorium You're welcome, but I don't have access to MacOS so I don't want to cover this topic really. You can read about the motives for the change here and how it might affect you.Ordain
I see the same issue on Windows 10 with Python 3.8. This is not just a Mac thing. No need to start Mac rumors. :D I see the same errors with the same line numbers. This is surprising. Check it out: Traceback (most recent call last): File "<string>", line 1, in <module> File "C:\Users\user3870315\AppData\Local\Programs\Python\Python38-32\lib\multiprocessing\spawn.py", line 116, in spawn_main exitcode = _main(fd, parent_sentinel) File "C:\Users\user3870315\AppData\Local\Programs\Python\Python38-32\lib\multiprocessing\spawn.py", line 125, in _main prepare(preparation_data) ...Foilsman
Also, @Ordain your solution throws another error on my computer: Traceback (most recent call last): File "C:\\workspace\\herrderr\\foobar\\hhgttg.py", line 156, in dynamic_dco multiprocessing.set_start_method("fork") File "C:\Users\user3870315\AppData\Local\Programs\Python\Python38-32\lib\multiprocessing\context.py", line 247, in set_start_method self._actual_context = self.get_context(method) File "C:\Users\user3870315\AppData\Local\Programs\Python\Python38-32\lib\multiprocessing\context.py", line 239, in get_ ... ValueError: cannot find context for 'fork'Foilsman

I faced the same problem when upgrading from Python 3.7 to 3.8. I am now running 3.8.6 on macOS 10.15.6, with Python installed via pyenv.

The advice from Darkonaut resolved the issue, but it is buried in the comments, so let me repeat it here:

Python 3.8 on macOS now uses spawn instead of fork as the default start method for new processes. Try with

multiprocessing.set_start_method("fork")
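
A minimal sketch of where that call could go, assuming the script from the question. The helper `pick_start_method` is hypothetical (it is not part of `multiprocessing`); it falls back to the platform default when "fork" is not supported, as on Windows, where `set_start_method("fork")` raises `ValueError: cannot find context for 'fork'`.

```python
import multiprocessing
import random


def rand_num():
    # Same worker as in the question: print one random number.
    print(random.random())


def pick_start_method(preferred="fork"):
    # Hypothetical helper: use `preferred` if this platform supports it,
    # otherwise the platform default (the first entry of the list).
    available = multiprocessing.get_all_start_methods()
    return preferred if preferred in available else available[0]


if __name__ == "__main__":
    # Must be called once, before any Process or Pool is created.
    multiprocessing.set_start_method(pick_start_method())

    processes = [multiprocessing.Process(target=rand_num) for _ in range(4)]
    for p in processes:
        p.start()
    for p in processes:
        p.join()
```

If changing the global setting is undesirable, `multiprocessing.get_context("fork")` returns an object with the same API that can be used for a single pool or set of processes.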

Apparently the behaviour of spawn is wrong, as the following simple example reveals:

import multiprocessing

def parallel_function(x):
    print("Function called with", x)

def test_pool():
    print("Running test_pool")
    with multiprocessing.Pool(4) as pool:
        pool.map(parallel_function, range(10))

print("Starting the test")
test_pool()

This produces the following output:

Starting the test
Running test_pool
Starting the test
Running test_pool
Starting the test
Running test_pool
Starting the test
Running test_pool
Starting the test
Running test_pool
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Users/karel/.pyenv/versions/3.8.6/lib/python3.8/multiprocessing/spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "/Users/karel/.pyenv/versions/3.8.6/lib/python3.8/multiprocessing/spawn.py", line 125, in _main
    prepare(preparation_data)
  File "/Users/karel/.pyenv/versions/3.8.6/lib/python3.8/multiprocessing/spawn.py", line 236, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "/Users/karel/.pyenv/versions/3.8.6/lib/python3.8/multiprocessing/spawn.py", line 287, in _fixup_main_from_path

So the Pool doesn't create workers properly but instead attempts to run the whole script in each spawned process.
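
The re-run can be observed directly with a small sketch like the one below, assuming it is saved to a file and run as `python file.py` (the names `square` and `run_pool` are only illustrative). Under spawn, each child starts a fresh interpreter and imports the main script again under the module name `__mp_main__`, so the module-level print should appear once for the parent and once per worker, while the guarded block runs only in the parent.

```python
import multiprocessing

# Module-level code: executed by the parent (as "__main__") and again by
# every spawned child, which re-imports this file as "__mp_main__".
print("module-level code, __name__ =", __name__)


def square(x):
    return x * x


def run_pool():
    # Request "spawn" explicitly so the behaviour is the same on every platform.
    ctx = multiprocessing.get_context("spawn")
    with ctx.Pool(2) as pool:
        return pool.map(square, range(5))


if __name__ == "__main__":
    # Only the parent enters this block; in the children the guard is False,
    # so they do not try to create pools of their own.
    print(run_pool())
```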

Resendez answered 11/1, 2021 at 11:46 Comment(1)
Note that using fork instead of spawn can be problematic: github.com/python/cpython/issues/84559. There is a change planned to make spawn the default on Unix systems. Your example breaks because you do not protect test_pool() from being run recursively: it should be called inside an if __name__ == '__main__' block.Hinayana

Using spawn, the default on macOS and Windows but not on other Unix systems, is actually advisable; there are known issues with fork: https://github.com/python/cpython/issues/84559

When using spawn, each child process re-imports the main script, so protect against the script being re-run recursively by wrapping the calling code in an if __name__ == '__main__' block.

import multiprocessing
import random
from concurrent.futures import ProcessPoolExecutor


def rand_num(i):
    num = random.random()
    print(f"worker {i} produced random number {num}")

def test_pool():
    print("Running test_pool")
    with ProcessPoolExecutor(mp_context=multiprocessing.get_context("spawn")) as pool:
        pool.map(rand_num, range(4))

if __name__ == '__main__':
    print("Starting the test")
    test_pool()
Hinayana answered 22/8, 2023 at 12:44 Comment(0)
