Python multiprocess profiling
I want to profile a simple multi-process Python script. I tried this code:

import multiprocessing
import cProfile
import time
def worker(num):
    time.sleep(3)
    print('Worker:', num)

if __name__ == '__main__':
    for i in range(5):
        p = multiprocessing.Process(target=worker, args=(i,))
        cProfile.run('p.start()', 'prof%d.prof' %i)

I'm starting 5 processes and therefore cProfile generates 5 different files. Each log only shows what happens inside the start method. How can I get logs that profile the worker function (and show that it took approximately 3 seconds in each case)?

Colour answered 14/6, 2012 at 21:33 Comment(0)

You're profiling the process startup, which is why you're only seeing what happens in p.start(), as you say. p.start() returns as soon as the subprocess is kicked off. You need to profile inside the worker function, which is what actually runs in the subprocesses.
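One way to do that is to run the real work under cProfile inside each child process, writing one stats file per worker. This is a sketch based on the question's script; the profiled_worker wrapper name is my own:

```python
import multiprocessing
import cProfile
import time

def worker(num):
    time.sleep(3)
    print('Worker:', num)

def profiled_worker(num):
    # This runs inside the child process, so the profiler sees the
    # worker's own work (including the 3-second sleep), not just p.start().
    cProfile.runctx('worker(num)', globals(), locals(), 'prof%d.prof' % num)

if __name__ == '__main__':
    processes = []
    for i in range(5):
        p = multiprocessing.Process(target=profiled_worker, args=(i,))
        p.start()
        processes.append(p)
    for p in processes:
        p.join()
```

Each prof%d.prof file can then be inspected with pstats, and the cumulative time of worker should be roughly 3 seconds.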

Psychoactive answered 14/6, 2012 at 21:36 Comment(4)
Wow! I feel dumb :) this was so simple. Thank you very much! And thank you for the quick response.Colour
You bet. And thank you for taking good care of your question!Psychoactive
could you illustrate how to do that?Oday
@DavoudTaghawi-Nejad See the "Working example" in the question.Psychoactive

Having to change your source code just to profile it isn't great. First, let's see what your code should look like:

import multiprocessing
import time
def worker(num):
    time.sleep(3)
    print('Worker:', num)

if __name__ == '__main__':
    processes = []
    for i in range(5):
        p = multiprocessing.Process(target=worker, args=(i,))
        p.start()
        processes.append(p)
    for p in processes:
        p.join()

I added join here so your main process will wait for your workers before quitting.

Instead of cProfile, try viztracer.

Install it with pip install viztracer, then use its multiprocess feature:

viztracer --log_multiprocess your_script.py

It will generate an HTML file showing every process on a timeline (use WASD to zoom and navigate).

[screenshot: result of the script, showing every process on the timeline]

Of course this includes some info that you are not interested in (like the internals of the multiprocessing library itself). If you are already satisfied with this, you are good to go. However, if you want a clearer graph showing only your worker() function, try the log_sparse feature.

First, decorate the function you want to log with @log_sparse

from viztracer import log_sparse

@log_sparse
def worker(num):
    time.sleep(3)
    print('Worker:', num)

Then run viztracer --log_multiprocess --log_sparse your_script.py

[screenshot: sparse log, showing only the worker calls]

Only your worker function, taking 3 s per call, will be displayed on the timeline.

Ichthyornis answered 12/11, 2020 at 20:36 Comment(1)
Make sure you tune --tracer_entries so that you can open the file in the browser (otherwise the tab always crashes for me), and at the same time --max_stack_depth so that all relevant data are there and not overwritten by other entries (I saw only one call of my function when in fact there were many).Ricebird
F
3

If you have a complex process structure and want to profile a particular portion of the code, or the working core of a particular process, you can point the profiler at just that section (see the enable and disable methods: https://docs.python.org/3.6/library/profile.html#module-cProfile). This is what you can do:

import cProfile

def my_particular_worker_code():
    pr = cProfile.Profile()
    pr.enable()

    # Process stuff to be profiled

    pr.disable()
    pr.print_stats(sort='tottime')  # sort as you wish

You can drop the reports to a file as well.
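For writing to a file, Profile.dump_stats saves the collected stats in binary form, and pstats.Stats reads them back. A minimal sketch (the work function here is a stand-in for the code being profiled):

```python
import cProfile
import pstats

def work():
    # Stand-in for the code section you want to profile.
    return sum(i * i for i in range(100000))

pr = cProfile.Profile()
pr.enable()
work()
pr.disable()

pr.dump_stats('worker.prof')                 # write binary stats to a file

stats = pstats.Stats('worker.prof')          # load them back later
stats.sort_stats('tottime').print_stats(5)   # print the top 5 by total time
```

The saved file can also be opened with visual tools such as snakeviz.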

Farr answered 26/3, 2021 at 14:1 Comment(0)
