Fix jumping of multiple progress bars (tqdm) in python multiprocessing
I want to parallelize a task (progresser()) over a range of input parameters (L). The progress of each task should be monitored by an individual progress bar in the terminal. I'm using the tqdm package for the progress bars. The following code works on my Mac for up to 23 progress bars (L = list(range(23)) and below), but produces chaotic jumping of the progress bars starting at L = list(range(24)). Does anyone have an idea how to fix this?

from time import sleep
import random
from tqdm import tqdm
from multiprocessing import Pool, freeze_support, RLock

L = list(range(24)) # works until 23, breaks starting at 24

def progresser(n):
    text = f'#{n}'

    sampling_counts = 10
    with tqdm(total=sampling_counts, desc=text, position=n+1) as pbar:
        for i in range(sampling_counts):
            sleep(random.uniform(0, 1))
            pbar.update(1)

if __name__ == '__main__':
    freeze_support()

    p = Pool(processes=None,
             initializer=tqdm.set_lock, initargs=(RLock(),))
    p.map(progresser, L)
    print('\n' * (len(L) + 1))

As an example of how it should look in general, I provide a screenshot for L = list(range(16)) below.

[screenshot: multiprocessing progress bars]

versions: python==3.7.3, tqdm==4.32.1

Estey answered 19/6, 2019 at 10:33 Comment(0)
I'm not getting any jumping when I set the size to 30. Maybe you have more processors than I do and can have more workers running at once.

However, as n grows large you will start to see jumps because of the nature of the chunksize: p.map splits your input into chunks and gives each process a chunk to work through. So as n grows larger, so does your chunksize, and so does your position (position=n+1)!

Note: although map preserves the order of the returned results, the order in which they are computed is arbitrary.
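The default chunking can be sketched as follows (this mirrors CPython's multiprocessing.pool heuristic; the factor of 4 is an internal implementation detail, not a guaranteed API):

```python
# Sketch of Pool.map's default chunk-size heuristic (as implemented
# in CPython's multiprocessing.pool; an internal detail, may change).
def default_chunksize(n_tasks, n_workers):
    chunksize, extra = divmod(n_tasks, n_workers * 4)
    if extra:
        chunksize += 1
    return chunksize

# With 8 workers, small inputs get chunksize 1 (one task per dispatch),
# but larger inputs get grouped, so a run of consecutive n values all
# execute inside the same worker process.
for n_tasks in (23, 24, 100):
    print(n_tasks, '->', default_chunksize(n_tasks, 8))
```

Once the chunksize exceeds 1, a single worker walks through consecutive n values, so its position=n+1 keeps changing between bars and the output jumps around.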

As n grows large, I would suggest using the worker process id as the position instead, to view progress on a per-process basis.

from time import sleep
import random
from tqdm import tqdm
from multiprocessing import Pool, freeze_support, RLock
from multiprocessing import current_process


def progresser(n):
    text = f'#{n}'
    sampling_counts = 10
    current = current_process()
    pos = current._identity[0]-1

    with tqdm(total=sampling_counts, desc=text, position=pos) as pbar:
        for i in range(sampling_counts):
            sleep(random.uniform(0, 1))
            pbar.update(1)

if __name__ == '__main__':
    freeze_support()
    L = list(range(30)) # no jumping here, since positions are per worker
    # p = Pool(processes=None,
    #         initargs=(RLock(),), initializer=tqdm.set_lock
    #         )
    with Pool(initializer=tqdm.set_lock, initargs=(tqdm.get_lock(),)) as p: 
        p.map(progresser, L)
        print('\n' * (len(L) + 1))
Sydelle answered 3/11, 2019 at 19:34 Comment(1)
Getting an empty tuple for _identity on Python 3.10 (Upstream)
I don't think this is a problem with parallelism, but rather with earlier progress bars changing the "zero position" of the cursor when they finish. I am not sure how to achieve what the OP wanted (keeping a log of every subtask), or whether that is even achievable with the current implementation of tqdm. However, if you show progress by worker id rather than by task id (as @shan-l suggested), I find leave=False (see the docs) practical: it avoids the jumping behavior and leaves the terminal clean afterwards.

To compensate for finished tasks having their progress bars cleared, you can add a TOTAL progress bar that remains, updated by iterating over pool.imap instead of calling results = pool.map(progresser, L). You can also use pool.imap_unordered if you prefer the TOTAL bar to advance as soon as any task finishes, at the cost of result order.
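A minimal sketch of that imap_unordered variant, with a hypothetical square() standing in for the real task:

```python
from multiprocessing import Pool
from tqdm import tqdm

def square(n):  # hypothetical stand-in for progresser()
    return n * n

if __name__ == '__main__':
    L = list(range(50))
    with Pool() as pool:
        # imap_unordered yields each result as soon as any task finishes,
        # so the TOTAL bar advances immediately; results arrive out of order.
        results = list(tqdm(pool.imap_unordered(square, L),
                            total=len(L), desc="TOTAL"))
    print(sorted(results) == [n * n for n in L])  # same result set, any order
```

If you need to match results back to their inputs, return (n, result) tuples from the task, since the yield order is not the submission order.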

By the way, passing the tqdm lock using initializer=tqdm.set_lock, initargs=(tqdm.get_lock(),) (the reason I was pointed to this answer) didn't always work for me, in particular with the "spawn" start method. Here is a version that fixes it:

import time, random
from tqdm import tqdm
from multiprocessing import get_context, Pool, RLock, freeze_support, current_process

def progresser(n, sampling_counts=100):
    worker_id = current_process()._identity[0]-1
    for _ in tqdm(
        range(sampling_counts),
        desc=f"#{n} on {worker_id}",
        position=worker_id+1,       # note the "+1" for the TOTAL progress bar
        leave=False,                # the progress bar will be cleared up and the cursor position unchanged when finished
    ):
        time.sleep(random.uniform(0, 1/sampling_counts))
    return n

if __name__ == '__main__':
    # freeze_support()              # this worked for me without it
    L = list(range(200))

    # ctx = get_context('fork')
    # lock = tqdm.get_lock()
    ctx = get_context('spawn')
    lock = ctx.RLock()              # I had to create a fresh lock

    tqdm.set_lock(lock)
    with ctx.Pool(
        # processes=10,             # os.cpu_count() workers if not specified
        initializer=tqdm.set_lock, initargs=(lock,),
    ) as pool:
        results = []
        for result in tqdm(
            pool.imap(progresser, L), total=len(L),
            desc="TOTAL",
        ):
            results.append(result)

It looks something like this, where only the TOTAL progress bar remains at the end of execution:

TOTAL:  60%|████████████████████▌             | 121/200 [00:03<00:02, 34.16it/s]
#136 on 0:  91%|████████████████████████▌  | 910/1000 [00:00<00:00, 1794.29it/s]
#129 on 1:  87%|███████████████████████▍   | 868/1000 [00:00<00:00, 1732.13it/s]
#135 on 2:  89%|████████████████████████▏  | 894/1000 [00:00<00:00, 1817.64it/s]
#143 on 3:   0%|                                       | 0/1000 [00:00<?, ?it/s]
#140 on 4:   0%|                                       | 0/1000 [00:00<?, ?it/s]
#132 on 5:  91%|████████████████████████▌  | 911/1000 [00:00<00:00, 1775.48it/s]
#127 on 6:  88%|███████████████████████▊   | 883/1000 [00:00<00:00, 1754.60it/s]
#131 on 7:  89%|███████████████████████▉   | 886/1000 [00:00<00:00, 1775.83it/s]
#141 on 8:   0%|                                       | 0/1000 [00:00<?, ?it/s]
#144 on 9:   0%|                                       | 0/1000 [00:00<?, ?it/s]
#146 on 10:   0%|                                      | 0/1000 [00:00<?, ?it/s]
#137 on 11:  90%|███████████████████████▎  | 896/1000 [00:00<00:00, 1744.15it/s]
#142 on 12:   0%|                                      | 0/1000 [00:00<?, ?it/s]
#138 on 13:  92%|███████████████████████▊  | 918/1000 [00:00<00:00, 1756.40it/s]
#133 on 14:  90%|███████████████████████▍  | 903/1000 [00:00<00:00, 1770.85it/s]
#126 on 15:  92%|███████████████████████▊  | 917/1000 [00:00<00:00, 1767.52it/s]
#134 on 16:  92%|███████████████████████▉  | 920/1000 [00:00<00:00, 1826.77it/s]
#128 on 17:  90%|███████████████████████▍  | 903/1000 [00:00<00:00, 1756.97it/s]
#145 on 18:   0%|                                      | 0/1000 [00:00<?, ?it/s]
#139 on 19:  72%|██████████████████▋       | 720/1000 [00:00<00:00, 1777.66it/s]
Blowfish answered 2/5 at 5:0 Comment(0)
