Python doctest hangs using ProcessPoolExecutor

This code runs fine under regular CPython 3.5:

import concurrent.futures

def job(text):
    print(text)

with concurrent.futures.ProcessPoolExecutor(1) as pool:
    pool.submit(job, "hello")

But if you run it as python -m doctest myfile.py, it hangs. Changing submit(job to submit(print makes it not hang, as does using ThreadPoolExecutor instead of ProcessPoolExecutor.

Why does it hang when run under doctest?

Tomkins answered 12/1, 2018 at 2:55 Comment(1)
Any update/feedback on the answer I posted? - Niue

I think the issue is caused by your with statement. When you have the following:

with concurrent.futures.ProcessPoolExecutor(1) as pool:
    pool.submit(job, "hello")

The with block forces the job to be executed and the pool to be shut down then and there: leaving the block calls the pool's shutdown method and waits for it. When you run the file as the main script, this works, and the background thread gets time to hand off the job. But when the file is imported as a module, the background thread never gets that chance while the shutdown of the pool waits for the work to finish, hence a deadlock.
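To make the role of the with statement concrete, here is a minimal sketch of what the block amounts to: submit, then an unconditional shutdown(wait=True) on the way out. The helper name run_with_explicit_shutdown is made up for illustration, and pow is used instead of job because a builtin can be pickled without importing this file.

```python
import concurrent.futures


def run_with_explicit_shutdown(executor_cls, fn, *args):
    """Submit one call, then shut down explicitly, mirroring a `with` block."""
    pool = executor_cls(1)
    try:
        future = pool.submit(fn, *args)
    finally:
        # Executor.__exit__ makes this same call; it blocks until all
        # submitted work has finished.
        pool.shutdown(wait=True)
    return future.result()


if __name__ == "__main__":
    # pow is a builtin, so it pickles without importing this file.
    print(run_with_explicit_shutdown(
        concurrent.futures.ProcessPoolExecutor, pow, 2, 10))
```

The blocking shutdown is the part that waits forever when the work item can never be handed to the worker.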

So the workaround that you can use is below

import concurrent.futures

def job(text):
    print(text)

pool = concurrent.futures.ProcessPoolExecutor(1)
pool.submit(job, "hello")

if __name__ == "__main__":
    pool.shutdown(True)

This will prevent the deadlock and will let you run doctest as well as import the module if you want

Niue answered 12/4, 2018 at 6:38 Comment(2)
This answer is a little misleading, because the problem is not with the with statement. You can reproduce this behaviour without the with statement by doing pool = ...ProcessPoolExecutor(); pool.submit(...); pool.shutdown(). The problem is the import lock, as I note in my answer. - Interdental
@daphtdazz, I do agree with you. I was not aware of https://docs.python.org/3/library/imp.html#imp.lock_held to quote in my answer; I just knew it was an import deadlock. When I said the with statement is the issue, I meant that the __exit__ of the ProcessPoolExecutor executes the shutdown method and causes the deadlock with the import. Your answer explains one layer below mine. Both are correct in their own context: you explained why it doesn't work, and I explained how to make it work. - Niue

The problem is that importing a module acquires a lock (which lock depends on your Python version); see the docs for imp.lock_held.

Locks are shared over multiprocessing, so the deadlock occurs because your main process, while it is importing your module, starts and waits for a subprocess that tries to import the same module but cannot acquire the lock, which the main process is still holding.

In step form:

  1. Main process acquires lock to import myfile.py
  2. Main process starts importing myfile.py (it has to import myfile.py because that is where your job() function is defined, which is why it didn't deadlock for print()).
  3. Main process starts and blocks on subprocess.
  4. Subprocess tries to acquire lock to import myfile.py

=> Deadlock.
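Step 2 depends on how functions are pickled: by reference (module name plus function name), not by value, so resolving that reference goes through the import system. A minimal sketch, using json.dumps as a stand-in for a function defined in an importable module such as job in myfile.py, and print as the builtin case:

```python
import json
import pickle

# json.dumps stands in for a function defined in an importable module
# (like job in myfile.py); print stands in for a builtin.
for fn in (json.dumps, print):
    payload = pickle.dumps(fn)
    # The payload holds only the module name and function name, no code.
    assert fn.__module__.encode() in payload
    assert fn.__name__.encode() in payload
    # Unpickling resolves that reference by importing the module.
    assert pickle.loads(payload) is fn
    print(fn.__module__, fn.__name__, len(payload))
```

For print the module is builtins, which is always loaded already, so nothing has to wait on the import of myfile.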

Interdental answered 17/4, 2018 at 14:28 Comment(0)

doctest imports your module in order to process it. Try adding this to prevent execution on import:

if __name__ == "__main__":
    with concurrent.futures.ProcessPoolExecutor(1) as pool: 
        pool.submit(job, "hello")
Onus answered 12/1, 2018 at 6:50 Comment(2)
That sidesteps the problem by preventing the code from running altogether. But I don't want to prevent the code from running, I want to prevent it from hanging. - Tomkins
The code should run when the module is loaded (e.g. by doctest, or regular import), or run as a standalone script. - Tomkins

This should actually be a comment, but it's too long to be one.

Your code fails if it's imported as a module too, with the same error as doctest. I get _pickle.PicklingError: Can't pickle <function job at 0x7f28cb0d2378>: import of module 'a' failed (I named the file as a.py).

Your lack of if __name__ == "__main__": violates the programming guidelines for multiprocessing: https://docs.python.org/3.6/library/multiprocessing.html#the-spawn-and-forkserver-start-methods

I guess that the child processes will also try to import the module, which then tries to start another child process (because the pool unconditionally executes). But I'm not 100% sure about this. I'm also not sure why the error you get is can't pickle <function>.

The issue here seems to be that you want the module to auto start a process on import. I'm not sure if this is possible.

Nebraska answered 11/4, 2018 at 17:53 Comment(2)
I see what you're saying. Still, the problem is that I want to be able to launch a ProcessPoolExecutor within a doctest. That is what I can't get to work. Simply hiding all the code under if __name__ == "__main__" doesn't work, because that prevents the code from ever running (under doctest). - Tomkins
Why not put the code for the ProcessPoolExecutor in the doctest string so it runs it as a test? Or is there some other use case? - Nebraska
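A minimal sketch of that suggestion, assuming the file is still called myfile.py. The run_job helper and its executor_cls parameter are hypothetical names, not something from the question. Because doctest runs the examples only after the import of the module has finished, this should not hit the import-time deadlock, though I have not checked it on every platform or start method.

```python
import concurrent.futures


def job(text):
    # Return the value instead of printing: output printed in a worker
    # process is generally not captured by doctest.
    return text.upper()


def run_job(text, executor_cls=concurrent.futures.ProcessPoolExecutor):
    """Run job(text) in a single-worker pool and return its result.

    >>> run_job("hello")
    'HELLO'
    """
    # Nothing here runs at import time; the pool is only created when
    # run_job is called, e.g. from the doctest example above.
    with executor_cls(1) as pool:
        return pool.submit(job, text).result()
```

Running python -m doctest myfile.py then executes the pool as part of the test, while a plain import of the module starts no processes.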
