What exactly is Python multiprocessing Module's .join() Method Doing?

6

160

I'm learning about Python multiprocessing (from a PyMOTW article) and would love some clarification on what exactly the join() method is doing.

In an old tutorial from 2008 it states that without the p.join() call in the code below, "the child process will sit idle and not terminate, becoming a zombie you must manually kill".

from multiprocessing import Process

def say_hello(name='world'):
    print("Hello, %s" % name)

if __name__ == '__main__':
    p = Process(target=say_hello)
    p.start()
    p.join()  # wait for the child process to exit

I added a printout of the PID as well as a time.sleep to test and as far as I can tell, the process terminates on its own:

from multiprocessing import Process, current_process
import sys
import time

def say_hello(name='world'):
    me = current_process()  # the child's own Process object
    print("Hello, %s" % name)
    print('Starting: %s %s' % (me.name, me.pid))
    sys.stdout.flush()
    print('Exiting : %s %s' % (me.name, me.pid))
    sys.stdout.flush()
    time.sleep(20)

if __name__ == '__main__':
    p = Process(target=say_hello)
    p.start()
    # no p.join()

within 20 seconds:

936 ttys000    0:00.05 /Library/Frameworks/Python.framework/Versions/2.7/Reso
938 ttys000    0:00.00 /Library/Frameworks/Python.framework/Versions/2.7/Reso
947 ttys001    0:00.13 -bash

after 20 seconds:

947 ttys001    0:00.13 -bash

The behavior is the same with p.join() added back at the end of the file. Python Module of the Week offers a very readable explanation of the module: "To wait until a process has completed its work and exited, use the join() method." But it seems that at least OS X was doing that anyway.

I am also wondering about the name of the method. Is the .join() method concatenating anything here? Is it concatenating a process with its end? Or does it just share a name with Python's native .join() method?

Cousingerman answered 19/8, 2014 at 18:59 Comment(8)
As far as I know, it holds the main thread, waits for the child process to complete, and then joins the resources back into the main thread; mostly it makes for a clean exit. - Poisoning
Ah, that makes sense. So the actual CPU and memory resources are separated from the parent process, then joined back again after the child process has completed? - Cousingerman
Yes, that is what it's doing. So if you don't join them back, the child process just lingers as a defunct (dead) process once it has finished. - Poisoning
@Poisoning That's not true. The child processes will be implicitly joined when the main process completes. - Clotilda
@Clotilda I am also learning Python and just shared what I found in my tests. My main process never ended, so maybe that's why I saw those child processes as defunct. - Poisoning
@Poisoning Yes, once they complete, the child processes will show up as zombies until the main process exits (or join() is called explicitly). - Clotilda
@MikeiLL, is your question about OS X specifically, or multiprocessing in general? - Paulenepauletta
@BrianCain Well, about the Python multiprocessing module. - Cousingerman
C
170

The join() method, when used with threading or multiprocessing, is not related to str.join(); it is not concatenating anything. Rather, it just means "wait for this [thread/process] to complete". The name join is used because the multiprocessing module's API is meant to look as similar as possible to the threading module's API, and the threading module uses join for its Thread object. Using the term join to mean "wait for a thread to complete" is common across many programming languages, so Python adopted it as well.
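To see the "wait for this process to complete" meaning in action, here is a minimal sketch. The names `work` and `run_and_wait` are made up for illustration, and the explicit "fork" context is an assumption (Unix-like systems only); a plain `multiprocessing.Process` should behave the same way.

```python
import multiprocessing
import time

# Assumption: a Unix-like system, so the "fork" start method is available.
ctx = multiprocessing.get_context("fork")

def work(seconds):
    """Hypothetical worker: sleeps to simulate a task."""
    time.sleep(seconds)

def run_and_wait(seconds=0.5):
    p = ctx.Process(target=work, args=(seconds,))
    start = time.monotonic()
    p.start()
    p.join()  # the parent blocks here until the child has exited
    waited = time.monotonic() - start
    # After join() returns, the child is no longer alive and has an exit code.
    return waited, p.is_alive(), p.exitcode

if __name__ == "__main__":
    waited, alive, code = run_and_wait()
    print("waited %.2fs, alive=%s, exitcode=%s" % (waited, alive, code))
```

Nothing is concatenated anywhere: join() simply does not return until the child is done.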

Now, the reason you see the 20 second delay both with and without the call to join() is because by default, when the main process is ready to exit, it will implicitly call join() on all running multiprocessing.Process instances. This isn't as clearly stated in the multiprocessing docs as it should be, but it is mentioned in the Programming Guidelines section:

Remember also that non-daemonic processes will be joined automatically.

You can override this behavior by setting the daemon flag on the Process to True prior to starting the process:

p = Process(target=say_hello)
p.daemon = True
p.start()
# Both parent and child will exit here, since the main process has completed.

If you do that, the child process will be terminated as soon as the main process completes:

daemon

The process’s daemon flag, a Boolean value. This must be set before start() is called.

The initial value is inherited from the creating process.

When a process exits, it attempts to terminate all of its daemonic child processes.
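One way to observe both behaviors is to time how long a fresh interpreter takes to exit with and without the daemon flag. This is only a sketch: `SCRIPT` and `time_interpreter_exit` are hypothetical names, and `time.sleep` stands in for a real worker function.

```python
import subprocess
import sys
import time

# Script run in a fresh interpreter; {daemon} is filled in below.
# time.sleep is the target so the child needs no user-defined function.
SCRIPT = """
import time
from multiprocessing import Process

if __name__ == '__main__':
    p = Process(target=time.sleep, args=(2.0,))
    p.daemon = {daemon}
    p.start()
    # No p.join(): the main program simply ends here.
"""

def time_interpreter_exit(daemon):
    """Return how long a fresh interpreter takes to run SCRIPT and exit."""
    start = time.monotonic()
    subprocess.check_call([sys.executable, "-c", SCRIPT.format(daemon=daemon)])
    return time.monotonic() - start

if __name__ == "__main__":
    print("non-daemonic: %.2fs" % time_interpreter_exit(False))
    print("daemonic:     %.2fs" % time_interpreter_exit(True))
```

The non-daemonic run should take at least the two seconds the child sleeps, because the interpreter joins the child at exit; the daemonic run should return much sooner, because the child is terminated instead.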

Clotilda answered 19/8, 2014 at 19:7 Comment(3)
I understood p.daemon = True to be for "starting a background process that runs without blocking the main program from exiting". But if "the daemon process is terminated automatically before the main program exits", what exactly is its use? - Cousingerman
@Cousingerman Basically anything you want going on in the background for as long as the parent process is running, but that doesn't need to be cleaned up gracefully prior to exiting the main program. Perhaps a worker process that reads data from a socket or hardware device, and feeds that data back to the parent via a queue or processes it in the background for some purpose? In general I would say that using a daemonic child process isn't very safe, because the process is going to get terminated without a chance to clean up any open resources it may have. (cont.) - Clotilda
@Cousingerman A better practice would be to signal the child to clean up and exit prior to exiting the main process. You might think it would make sense to leave the daemonic child process running when the parent exits, but keep in mind that the multiprocessing API is designed to mimic the threading API as closely as possible. Daemonic threading.Thread objects are terminated as soon as the main thread exits, so daemonic multiprocessing.Process objects behave the same way. - Clotilda
W
50

Without the join(), the main process can complete before the child process does. I'm not sure under what circumstances that leads to zombieism.

The main purpose of join() is to ensure that a child process has completed before the main process does anything that depends on the work of the child process.
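As a rough illustration of "the main process depends on the child's work", here is a sketch that splits a sum across four workers. `partial_sum` and `parallel_sum` are hypothetical names, and the "fork" context is an assumption (Unix-like systems only).

```python
import multiprocessing

# Assumption: a Unix-like system, so the "fork" start method is available.
ctx = multiprocessing.get_context("fork")

def partial_sum(numbers, out_queue):
    """Hypothetical worker: sum one slice and report the result."""
    out_queue.put(sum(numbers))

def parallel_sum(numbers, workers=4):
    out_queue = ctx.Queue()
    chunks = [numbers[i::workers] for i in range(workers)]
    procs = [ctx.Process(target=partial_sum, args=(chunk, out_queue))
             for chunk in chunks]
    for p in procs:
        p.start()
    # Collect one result per worker before joining, so no worker is left
    # blocked while flushing its data into the queue.
    partials = [out_queue.get() for _ in procs]
    for p in procs:
        p.join()  # past this loop, every worker has exited
    return sum(partials)

if __name__ == "__main__":
    print(parallel_sum(list(range(101))))
```

The final `sum(partials)` only makes sense once every worker has delivered its part, which is the dependency that the waiting enforces.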

The etymology of join() is that it's the opposite of fork, which is the common term in Unix-family operating systems for creating child processes. A single process "forks" into several, then "joins" back into one.

Wesley answered 19/8, 2014 at 19:5 Comment(6)
It uses the name join() because join() is what's used to wait for a threading.Thread object to complete, and the multiprocessing API is meant to mimic the threading API as much as possible. - Clotilda
Your second statement addresses the problem I'm dealing with in a current project. - Cousingerman
I understand the part where the main thread waits for the sub-process to complete, but doesn't that sort of defeat the purpose of asynchronous execution? Isn't the sub-task or process supposed to finish execution independently? - Kenspeckle
@ApurvaKunkulol Depends on how you're using it, but join() is needed in the case where the main thread needs the results of the sub-threads' work. For example, if you're rendering something and assign 1/4 of the final image to each of 4 subprocesses, and want to display the entire image when it's done. - Wesley
@RussellBorogove Ah! I get it. Then the meaning of asynchronous activity is a little different here. It must mean only that the sub-processes carry out their tasks simultaneously with the main thread, while the main thread also does its job instead of just idly waiting on the sub-processes. - Kenspeckle
That's not a very useful distinction to make, in my opinion. - Wesley
C
20

I'm not going to explain in detail what join does, but here's the etymology and the intuition behind it, which should help you remember its meaning more easily.

The idea is that execution "forks" into multiple processes of which one is the main/primary process, the rest workers (or minor/secondary). When the workers are done, they "join" the main process so that serial execution may be resumed.

The join() causes the main process to wait for a worker to join it. The method might better have been called "wait", since that's the actual behavior it causes in the main process (and that's what it's called in POSIX, although POSIX threads call it "join" as well). The joining only occurs as an effect of the processes cooperating properly; it's not something the main process does.
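For comparison, this is roughly what the fork/wait pair looks like at the POSIX level, using Python's os module. It is a Unix-only sketch, and `fork_and_wait` is a name made up for illustration.

```python
import os

def fork_and_wait(child_exit_code=7):
    pid = os.fork()  # execution "forks" into parent and child here
    if pid == 0:
        # Child branch: leave immediately with the requested status.
        os._exit(child_exit_code)
    # Parent branch: block until that child has exited ("wait" in POSIX).
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)

if __name__ == "__main__":
    print("child exited with", fork_and_wait())
```

Process.join() plays the role of the `os.waitpid()` call here: the parent does nothing but wait for the child to finish.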

The names "fork" and "join" have been used with this meaning in multiprocessing since 1963.

Cincinnatus answered 19/8, 2014 at 18:59 Comment(2)
So in a way this use of the word join may have preceded its use in referring to concatenation, as opposed to the other way around. - Cousingerman
It's unlikely that the use in concatenation derived from the use in multiprocessing; rather both senses derive separately from the plain-English sense of the word. - Wesley
D
9

The join() call ensures that subsequent lines of your code are not called before all the multiprocessing processes are completed.

For example, without the join() calls, the following code would call restart_program() before the processes finish, while the workers are still running in the background, which is not what we want here (you can try it):

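import multiprocessing

def calculate_stuff(i):   # placeholder for the real worker function
    print("calculating", i)

def restart_program():    # placeholder for whatever must run afterwards
    print("all workers done")

if __name__ == '__main__':
    num_processes = 5
    processes = []
    for i in range(num_processes):
        p = multiprocessing.Process(target=calculate_stuff, args=(i,))
        p.start()
        processes.append(p)
    for p in processes:
        p.join()  # block until this worker has finished
    # reached only after every worker has been joined
    restart_program()

Dasteel answered 1/4, 2020 at 9:21 Comment(0)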
C
3

join() is used to wait for the worker processes to exit. With a multiprocessing.Pool, one must call close() or terminate() before using join(); a plain Process.join() has no such requirement.

As @Russell mentioned, join is like the opposite of fork (which spawns sub-processes).
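For the pool case, the close()-then-join() ordering looks roughly like this. It is a sketch only: `factorials_with_pool` is a hypothetical helper, `math.factorial` stands in for a real task function, and the "fork" context is an assumption (Unix-like systems only).

```python
import math
import multiprocessing

# Assumption: a Unix-like system, so the "fork" start method is available.
ctx = multiprocessing.get_context("fork")

def factorials_with_pool(values, workers=2):
    pool = ctx.Pool(processes=workers)
    try:
        # map() blocks until every result has been computed.
        results = pool.map(math.factorial, values)
    finally:
        pool.close()  # no more tasks may be submitted
        pool.join()   # wait for the worker processes to exit
    return results

if __name__ == "__main__":
    print(factorials_with_pool([3, 4, 5]))
```

Calling `pool.join()` while the pool is still accepting tasks raises an error, which is why close() (or terminate()) comes first.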

For Pool.join() to run, you first have to call close(), which prevents any more tasks from being submitted to the pool; the workers then exit once all tasks complete. Alternatively, terminate() stops all worker processes immediately.

"the child process will sit idle and not terminate, becoming a zombie you must manually kill": a zombie arises when the child has finished but its still-running parent has not yet collected its exit status (for example via join()). If the parent exits first while the child is still running, the child is orphaned instead; on Unix it is adopted by init, which collects its exit status when it finishes.

Curbing answered 20/2, 2018 at 6:39 Comment(0)
J
2

To wait until a process has completed its work and exited, use the join() method.

and

Note: It is important to join() the process after terminating it in order to give the background machinery time to update the status of the object to reflect the termination.

This is a good example that helped me understand it: here

One thing I noticed personally was that my main process paused at join() until the child had finished, which defeated the point of using multiprocessing.Process() in the first place.
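That pause mostly hurts when join() is called right after each start(). A sketch of the difference, with made-up names (`nap`, `join_immediately`, `start_all_then_join`) and the "fork" context assumed (Unix-like systems only):

```python
import multiprocessing
import time

# Assumption: a Unix-like system, so the "fork" start method is available.
ctx = multiprocessing.get_context("fork")

def nap(seconds):
    """Hypothetical worker: sleeps to simulate a task."""
    time.sleep(seconds)

def join_immediately(n=3, seconds=0.3):
    start = time.monotonic()
    for _ in range(n):
        p = ctx.Process(target=nap, args=(seconds,))
        p.start()
        p.join()  # waits for this child before starting the next one
    return time.monotonic() - start

def start_all_then_join(n=3, seconds=0.3):
    start = time.monotonic()
    procs = [ctx.Process(target=nap, args=(seconds,)) for _ in range(n)]
    for p in procs:
        p.start()
    # The parent is free to do its own work here while the children run.
    for p in procs:
        p.join()
    return time.monotonic() - start

if __name__ == "__main__":
    print("join after each start: %.2fs" % join_immediately())
    print("start all, then join:  %.2fs" % start_all_then_join())
```

The first variant runs the children one after another; the second lets them overlap and only waits at the point where the results are actually needed.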

Jennefer answered 9/7, 2020 at 9:16 Comment(0)
