kuk/log-progress

Progress bar for multiprocessing

Closed · 1 comment

Currently, it's possible to do something like:

def worker(i):
    d = 0
    for _ in range(i):
        d += 1
    return d

data = [int(5e6) for _ in range(10)]

# list() forces evaluation; on Python 3, map() alone is lazy and does no work
results = list(map(worker, log_progress(data, every=1)))

[screenshot, 2018-02-21 08:21:31]

If I try the same thing with a multiprocessing pool, the progress bar completes immediately, before the workers have finished.

from multiprocessing import Pool

def worker(i):
    d = 0
    for _ in range(i):
        d += 1
    return d

data = [int(5e6) for _ in range(10)]

p = Pool(2)
results = p.map(worker, log_progress(data, every=1))
p.close()
p.join()  # wait for the worker processes to exit

[screenshot, 2018-02-21 08:41:31]

Would it be possible to somehow make this work?

This happens because Pool.map collects the whole input iterable into a list before it dispatches any work to the worker function, so log_progress sees every item consumed at once.
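
The effect can be reproduced without a pool at all. The sketch below is an illustration only: `counting` is a hypothetical stand-in for log_progress, and `eager_map` imitates just the "materialise the input first" step that CPython's Pool.map performs for iterables without a length. It is not the library's actual implementation.

```python
def counting(iterable, log):
    """Hypothetical stand-in for log_progress: records each item it hands out."""
    for index, item in enumerate(iterable, 1):
        log.append(index)
        yield item


def eager_map(func, iterable):
    """Imitates the first step of Pool.map: drain the input, then do the work."""
    items = list(iterable)  # the progress wrapper is fully consumed here
    return [func(item) for item in items]


log = []                # what the "progress bar" has counted so far
progress_at_start = []  # progress value seen each time a task starts


def worker(i):
    progress_at_start.append(len(log))
    return i * 2


results = eager_map(worker, counting([1, 2, 3], log))
print(progress_at_start)  # [3, 3, 3]: the bar was already full before any work ran
print(results)            # [2, 4, 6]
```
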
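Pool.imap might be what you're looking for: it returns an iterator that yields results in input order as they become available, instead of blocking until the full result list is ready. Wrap that result iterator, rather than the input, in log_progress, and pass the total via size because the result iterator has no length.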

data = log_progress(pool.imap(func, args_list), size=len(args_list))
for e in data:
    ...
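
For reference, a self-contained sketch of the same pattern that runs outside a notebook. It assumes a text-based stand-in, `text_progress` (a hypothetical helper, not part of log_progress), so that nothing beyond the standard library is needed; the `run` function name is likewise only for illustration.

```python
from multiprocessing import Pool


def worker(n):
    """CPU-bound toy task: count up to n."""
    total = 0
    for _ in range(n):
        total += 1
    return total


def text_progress(iterable, size):
    """Hypothetical stand-in for log_progress: prints a line per finished item."""
    for done, item in enumerate(iterable, 1):
        print(f"{done}/{size} done")
        yield item


def run(data, processes=2):
    # Leaving the with block terminates the pool, so all results
    # are consumed inside it.
    with Pool(processes) as pool:
        results = pool.imap(worker, data)  # lazy iterator over results, in order
        return list(text_progress(results, size=len(data)))


if __name__ == "__main__":
    data = [int(1e5) for _ in range(10)]
    print(run(data))
```

Because the wrapper sits on the result side, each tick corresponds to a finished task rather than to an input item being read. If result order does not matter, Pool.imap_unordered can be swapped in so that a slow early task does not hold back the ticks of later ones.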