vutran1710/PyrateLimiter

Slowing down parallelism problem (multiprocessing.pool)

Closed this issue · 3 comments

import time
from multiprocessing.pool import Pool

import tqdm
from pyrate_limiter import Duration, Limiter, RequestRate

rate = RequestRate(7, Duration.SECOND)  # max 7 runs per second
limiter = Limiter(rate)

def func(arg):
    with limiter.ratelimit('SomeKeyWord12', delay=True):
        time.sleep(0.1)
        return 0

if __name__ == '__main__':  # guard required on spawn-based platforms (Windows/macOS)
    args = 8000 * ['some_dummy_arg']
    with Pool(40) as p:
        results = list(tqdm.tqdm(p.imap(func, args), total=len(args)))

100%|██████████| 8000/8000 [00:28<00:00, 281.30it/s]  # ~281 runs per second instead of the expected 7

What's wrong? Thx!

JWCook commented

The default backend just stores the current state in memory and won't work with multiprocessing, since each process will have its own copy of that state.

Either the Redis backend or SQLite + Filelock backend would work for this. See this section for more details on bucket backends: https://github.com/vutran1710/PyrateLimiter#backends
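
For reference, here is a minimal sketch of that setup with the SQLite + FileLock backend. The bucket class name and bucket_kwargs follow the README section linked above; treat the exact import path and kwargs as assumptions to check against your installed version, and note that the path is an arbitrary example:

import time
from multiprocessing.pool import Pool

from pyrate_limiter import Duration, Limiter, RequestRate
from pyrate_limiter import FileLockSQLiteBucket  # requires the `filelock` package

# One SQLite file shared by all worker processes; a file lock makes each
# bucket read + write atomic, so the 7/second limit holds across processes.
limiter = Limiter(
    RequestRate(7, Duration.SECOND),
    bucket_class=FileLockSQLiteBucket,
    bucket_kwargs={'path': '/tmp/pyrate_limiter.sqlite'},  # example path
)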

Hello @JWCook, I want to use the SQLite + FileLock backend, but I do not see where lock_acquire and lock_release are used. Do I have to call them manually every time I use the limiter?

JWCook commented

No, you don't need to do anything manually with the file lock. It's managed in the Limiter class, which needs to call multiple bucket methods (read + write) as a single atomic operation.
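
In other words, once the bucket class is configured as sketched above, the worker code from the original snippet works unchanged; the locking happens inside the ratelimit context manager:

# Unchanged worker code: the Limiter acquires and releases the file lock
# internally around each bucket read + write, so no manual locking is needed.
def func(arg):
    with limiter.ratelimit('SomeKeyWord12', delay=True):
        time.sleep(0.1)
        return 0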