vutran1710/PyrateLimiter

SQLite locks

Closed this issue · 10 comments

When calling each of the following functions from a different REPL, an OperationalError: database is locked error occurs.
Once it worked (but did not last past the first delay), and the total number of requests across both REPLs before the first delay was 4075 (75 over the 4000 limit).

from pyrate_limiter import Duration, Limiter, RequestRate, SQLiteBucket

limiter = Limiter(
    RequestRate(4000, Duration.MINUTE),
    bucket_class=SQLiteBucket,
    bucket_kwargs={'path': '/tmp/pyrate_limiter.sqlite'},
)

@limiter.ratelimit("test", delay=True)
def api_limit():
    pass

def func1():
    # Imported and called from REPL1
    count = 1
    while True:
        api_limit()
        print(count)
        count += 1

def func2():
    # Imported and called from REPL2
    count = 1
    while True:
        api_limit()
        print(count)
        count += 1

Is this related to #61?

Yes, this should be fixed by #61.

Version 2.6.3 fixed the total request count across both terminals (when they run simultaneously).

However, the "database is locked" error still pops up and disrupts (or, more commonly, blocks) the terminals when they run simultaneously.

The error, on a "Debian GNU/Linux 11 (bullseye)" box (pastebin expires in May 2022): https://pastebin.com/SHhLnSin

In case it helps, the above code works without problems in the deckar01 fork of ratelimit.

Okay, I see. #61 mainly made the SQLite bucket thread-safe. It's also now compatible with ProcessPoolExecutor, which spawns child processes from one parent process.

I just realized that dealing with multiple unrelated processes, like in this example, is a separate (and more difficult) problem.
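
For reference, here is a rough sketch of the case that #61 does cover: workers spawned from a single parent via ProcessPoolExecutor, all sharing one SQLite bucket file. The rate, worker count, and path below are illustrative, not taken from the test suite.

from concurrent.futures import ProcessPoolExecutor

from pyrate_limiter import Duration, Limiter, RequestRate, SQLiteBucket

limiter = Limiter(
    RequestRate(100, Duration.SECOND),
    bucket_class=SQLiteBucket,
    bucket_kwargs={'path': '/tmp/pyrate_limiter.sqlite'},
)

@limiter.ratelimit("test", delay=True)
def send_request():
    pass

def worker(n_requests):
    # All workers share the same bucket file, but they are children of one parent process
    for _ in range(n_requests):
        send_request()

if __name__ == "__main__":
    with ProcessPoolExecutor(max_workers=4) as executor:
        for _ in range(4):
            executor.submit(worker, 50)

Two separate REPLs, as in this issue, are unrelated processes with no common parent, which is the harder case.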

Thank you for the prompt answer.

In my case, I am trying to replicate the scenario where a cron job (for example) starts a new process while the previous one has not yet finished.

That makes sense. I think a file-level lock like py-filelock will be the only way to make the SQLite backend work correctly from multiple scripts, while also supporting all the existing features of this library (mainly, tracking several different rate limits at once). I'll look into that.
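
To illustrate the general idea (this is not the library's actual implementation; the table name, lock path, and helper function are made up for this example): every database operation is wrapped in a lock on a sidecar file, so only one process can touch the SQLite file at a time.

import sqlite3
from filelock import FileLock

DB_PATH = '/tmp/pyrate_limiter.sqlite'
lock = FileLock(DB_PATH + '.lock')

def add_entry(timestamp):
    # Only one process at a time may open the database, so concurrent scripts
    # wait on the lock file instead of hitting "database is locked"
    with lock:
        with sqlite3.connect(DB_PATH) as connection:
            connection.execute(
                'CREATE TABLE IF NOT EXISTS ratelimit (timestamp REAL)'
            )
            connection.execute(
                'INSERT INTO ratelimit (timestamp) VALUES (?)', (timestamp,)
            )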

It appears that in the issue you linked, that library is working around the problem by catching and ignoring OperationalError: https://github.com/deckar01/ratelimit/blob/master/ratelimit/decorators.py#L120
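
For comparison, here is a hedged sketch of that workaround pattern (not code from either library; the table and function names are invented): the lock error is swallowed and the call proceeds as if the window were empty.

import sqlite3

def count_recent_requests(connection, window_start):
    try:
        row = connection.execute(
            'SELECT COUNT(*) FROM requests WHERE timestamp >= ?', (window_start,)
        ).fetchone()
        return row[0]
    except sqlite3.OperationalError:
        # Another process holds the lock; pretend no requests were made.
        # The call goes through, so the shared limit can be exceeded.
        return 0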

It's also using a different algorithm (sliding window), so it performs fewer database operations, in a different order, which may make multiprocess errors less likely to come up. I don't believe that eliminates the problem, though; with a faster request rate and/or your scripts left running long enough, they will eventually make more requests than they're supposed to. Either way, the same approach wouldn't work for this library.

Oh! The code seems to be a kind of soft hack around the issue.

As a matter of fact, that fork was created to apply the sliding window algorithm and to try to solve these very same issues in the apparently unmaintained parent library.

Indeed py-filelock could prove to be py-substantially helpful!

I think I have this working in my branch for #68.

Here's a slightly modified example to test this:

import sys
from time import perf_counter as time

from pyrate_limiter import Limiter, RequestRate, Duration, FileLockSQLiteBucket

limiter = Limiter(
    RequestRate(100, Duration.SECOND),
    RequestRate(1000, Duration.MINUTE),
    bucket_class=FileLockSQLiteBucket,
)


@limiter.ratelimit("test", delay=True)
def send_request():
    """Rate-limited function representing a single request"""


def test_ratelimit(n_process=0):
    """Continuously send rate-limited requests"""
    start_time = time()
    n_requests = 1

    while True:
        send_request()
        print(f"[Process {n_process}] {time() - start_time:.2f}: {n_requests}")
        n_requests += 1


if __name__ == "__main__":
    n_process = sys.argv[1] if len(sys.argv) > 1 else 0
    test_ratelimit(n_process)

Either run from separate terminals, or as background processes:

python filelock_test.py 1 &
python filelock_test.py 2 &
python filelock_test.py 3 &
...

# Kill background processes when done:
pkill -f filelock_test.py

I ran this for several minutes with 0 extra requests. I also added an automated test that does basically the same thing, except using multiprocess.Process (but no shared Limiter object).
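
For anyone curious, a test along those lines might look roughly like this (a sketch under assumptions, not the actual test from the repo; the process API and request counts are placeholders). Each process builds its own Limiter and hits the same FileLockSQLiteBucket:

from multiprocessing import Process

from pyrate_limiter import Duration, Limiter, RequestRate, FileLockSQLiteBucket

def run_requests(n_requests):
    # No shared Limiter object: each process constructs its own
    limiter = Limiter(
        RequestRate(100, Duration.SECOND),
        bucket_class=FileLockSQLiteBucket,
    )

    @limiter.ratelimit("test", delay=True)
    def send_request():
        pass

    for _ in range(n_requests):
        send_request()

def test_multiprocess_ratelimit():
    processes = [Process(target=run_requests, args=(200,)) for _ in range(4)]
    for p in processes:
        p.start()
    for p in processes:
        p.join()
    # A nonzero exit code means a child process raised (e.g. OperationalError)
    assert all(p.exitcode == 0 for p in processes)

if __name__ == "__main__":
    test_multiprocess_ratelimit()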

@raratiru @SpiffSpaceman
FileLockSQLiteBucket is now available in v2.7. Give that a try and let me know if that works for you.

@JWCook Thank you, it works great!

Yes, I ran it a few times and it seems to be fine. Thanks.