asg017/sqlite-rembed

Rate limiting options

asg017 opened this issue · 2 comments

asg017 commented

Different providers have different limits, so it might be hard to coordinate. We will also need settings to disable or alter the limits for self-hosted options.

simonw commented

Some providers even return HTTP headers that tell the client how much of the current rate limit remains and when it will reset, so it would be possible to throttle requests to those providers automatically (sleep until the next "reset" point).

A call to the OpenAI embeddings API, for example, returns this:

x-ratelimit-limit-requests: 5000
x-ratelimit-limit-tokens: 5000000
x-ratelimit-remaining-requests: 4999
x-ratelimit-remaining-tokens: 4999990
x-ratelimit-reset-requests: 12ms
x-ratelimit-reset-tokens: 0s
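
As a rough sketch of the header-driven throttling described above, the Rust below turns a reset value into a `Duration` and decides whether to sleep. The function names `parse_reset` and `throttle_delay` are hypothetical and are not part of sqlite-rembed. The unit set (`ms`, `s`, `m`, `h`, possibly chained as in `6m0s`) is an assumption extrapolated from the sample values above, and other providers may use a different format entirely.

```rust
use std::time::Duration;

/// Parse a reset value such as "12ms", "0s", or "6m0s" into a Duration.
/// Assumed format: one or more <number><unit> pairs; returns None otherwise.
fn parse_reset(value: &str) -> Option<Duration> {
    let mut rest = value.trim();
    if rest.is_empty() {
        return None;
    }
    let mut total = Duration::ZERO;
    while !rest.is_empty() {
        // Numeric part: digits with an optional decimal point.
        let num_len = rest.find(|c: char| !(c.is_ascii_digit() || c == '.'))?;
        let number: f64 = rest[..num_len].parse().ok()?;
        rest = &rest[num_len..];

        // Unit part: everything up to the next digit (or end of string).
        let unit_len = rest
            .find(|c: char| c.is_ascii_digit() || c == '.')
            .unwrap_or(rest.len());
        let nanos_per_unit: f64 = match &rest[..unit_len] {
            "ms" => 1_000_000.0,
            "s" => 1_000_000_000.0,
            "m" => 60_000_000_000.0,
            "h" => 3_600_000_000_000.0,
            _ => return None,
        };
        rest = &rest[unit_len..];

        let nanos = (number * nanos_per_unit).round() as u64;
        total = total.saturating_add(Duration::from_nanos(nanos));
    }
    Some(total)
}

/// Given the values of x-ratelimit-remaining-requests and
/// x-ratelimit-reset-requests, return how long to pause before the next call.
/// Only waits when no requests remain; unparseable input means no wait.
fn throttle_delay(remaining_requests: &str, reset_requests: &str) -> Duration {
    match remaining_requests.trim().parse::<u64>() {
        Ok(0) => parse_reset(reset_requests).unwrap_or(Duration::ZERO),
        _ => Duration::ZERO,
    }
}
```

A caller could pass the result to `std::thread::sleep` before issuing the next request. Falling back to no wait on missing or unrecognised headers is a deliberate choice here so that self-hosted providers without these headers are never throttled; the token-based headers could be handled the same way.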

asg017 commented

Oooh interesting, that would be way better than trying to wrap some Rust rate limiter around every client. Will scour through all these clients and see which ones support that besides OpenAI.