custom rates support

Question

enesakar opened this issue a year ago · 21 comments

In the ratelimit SDK, is there a way to consume at different rates? For example, with the OpenAI API I want to allow 100 words per hour, so I need to count the words in the prompt and consume that many. Something like:
ratelimit.limit(identifier, wordCount)
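
For illustration, the `wordCount` in the call above could come from a small helper like the one below. This is only a sketch: `countWords` is a hypothetical name, and it assumes that a "word" is any run of non-whitespace characters, which is not the same as a model's token count.

```typescript
// Hypothetical helper: counts whitespace-separated words in a prompt.
// Assumption: a "word" is any run of non-whitespace characters.
function countWords(prompt: string): number {
  const trimmed = prompt.trim();
  // An empty or whitespace-only prompt has zero words.
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

// The result is the weight one would want to hand to the limiter.
const wordCount = countWords("Write a haiku about rate limits");
```

The value would then be passed as the second argument in the proposed `ratelimit.limit(identifier, wordCount)` call.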

ogzhanolguncu commented a year ago

bump

Answer 1 · 2024-01-11T01:50:34.000Z

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 30 days.

Answer 2 · 2024-02-10T11:06:40.000Z

@enesakar @ogzhanolguncu I was able to build this feature. So far I have only implemented it for the fixed window algorithm, and I tested it and it works fine. One thing I want to know is how many such rates we can expect from a user: will a user add only one extra rate (like the one you mentioned), or can there be many of them?

What I am planning to do is add an object of custom rates, each with a unique name and a max value, so the instance will look like this:

const ratelimit = new Ratelimit({
  redis: kv,
  limiter: Ratelimit.fixedWindow(10, "60s", {
    words: 100,
  }),
});

With this, prompts can consume at most 100 words in total within the given window. To apply the limit, we can do something like this:

export default async function Home() {
  const ip = headers().get("x-forwarded-for");
  const { success, limit, remaining, reset } = await ratelimit.limit(
    ip ?? "anonymous",
    undefined,
    {
      words: 20,
    }
  );

This will increment the word count by 20 on every request (in practice the value will be dynamic, based on the prompt).

What do you think? If there is a better approach, please let me know.

PS: Refactored the snippets and removed array args

Answer 3 · 2024-02-12T07:23:57.000Z

I feel like I still don't understand the practicality of this. What problem are we exactly trying to solve? @sourabpramanik, can you walk me through your example case so I can have a better understanding of the situation?

Answer 4 · 2024-02-12T07:51:58.000Z

In AI APIs, quotas are per token, not per request. I need to limit the rate by token count, so each request has a different token count/weight. Currently this is not possible, because each request counts as one.
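
To make the idea of a per-request weight concrete, here is a minimal in-memory sketch of a fixed window counter where each call consumes `weight` units instead of always one. It is not the SDK's implementation: `WeightedFixedWindow` and `consume` are hypothetical names, state lives in a local `Map` rather than Redis, and it assumes a rejected request consumes nothing.

```typescript
type LimitResult = { success: boolean; limit: number; remaining: number; reset: number };

// Hypothetical in-memory fixed window limiter with per-call weights.
class WeightedFixedWindow {
  private readonly limit: number;
  private readonly windowMs: number;
  private readonly windows = new Map<string, { start: number; used: number }>();

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // `now` is injectable so the behavior can be checked without waiting.
  consume(identifier: string, weight: number = 1, now: number = Date.now()): LimitResult {
    if (weight < 0) throw new RangeError("weight must be non-negative");
    // Align the window start to a multiple of the window length.
    const start = Math.floor(now / this.windowMs) * this.windowMs;
    let w = this.windows.get(identifier);
    if (!w || w.start !== start) {
      w = { start, used: 0 };
      this.windows.set(identifier, w);
    }
    // Assumption: a request that would exceed the limit is rejected whole
    // and does not count against the budget.
    const success = w.used + weight <= this.limit;
    if (success) w.used += weight;
    return { success, limit: this.limit, remaining: this.limit - w.used, reset: start + this.windowMs };
  }
}

// Usage: 100 tokens per hour; one request spends 60 of them.
const tokenLimiter = new WeightedFixedWindow(100, 60 * 60 * 1000);
const firstCall = tokenLimiter.consume("user-1", 60);
```

Whether a rejected request should still be counted is a design choice; this sketch picks the option that leaves the budget untouched.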

Answer 5 · 2024-02-12T07:57:38.000Z

Sure. We are currently rate-limiting on only one factor: the number of requests. Now we want to support custom limiters, for example word counts. These custom limiters can be of any type, can have different rates and limits, and there can be as many as the user wants. On each request, each counter increases by its own amount, and whichever factor hits its limit first within the given window blocks further requests.

I have modified the fixed window algorithm for now so that it works for these cases. If you want, I can raise a PR so you can understand it better.
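
The "whichever factor hits the limit first blocks" behavior described above can be sketched in memory as follows. This is an illustration under assumptions, not the code from the PR: `MultiRateWindow` is a hypothetical name, counters are kept in a local `Map` instead of Redis, and a request is applied to all counters or to none.

```typescript
type Rates = Record<string, number>;
type MultiResult = { success: boolean; exhausted: string[]; remaining: Rates };

// Hypothetical fixed window limiter tracking several named counters at once.
class MultiRateWindow {
  private readonly limits: Rates;
  private readonly windowMs: number;
  private readonly state = new Map<string, { start: number; used: Rates }>();

  constructor(limits: Rates, windowMs: number) {
    this.limits = limits;
    this.windowMs = windowMs;
  }

  consume(identifier: string, weights: Rates, now: number = Date.now()): MultiResult {
    const start = Math.floor(now / this.windowMs) * this.windowMs;
    let s = this.state.get(identifier);
    if (!s || s.start !== start) {
      s = { start, used: {} };
      this.state.set(identifier, s);
    }
    const used = s.used;
    const names = Object.keys(this.limits);
    // Collect every counter that this request would push past its limit.
    const exhausted = names.filter(
      (name) => (used[name] ?? 0) + (weights[name] ?? 0) > this.limits[name]
    );
    // Assumption: apply the request to all counters only if none is exceeded.
    if (exhausted.length === 0) {
      for (const name of names) used[name] = (used[name] ?? 0) + (weights[name] ?? 0);
    }
    const remaining: Rates = {};
    for (const name of names) remaining[name] = this.limits[name] - (used[name] ?? 0);
    return { success: exhausted.length === 0, exhausted, remaining };
  }
}

// Usage: 10 requests and 100 words per 60 seconds, whichever runs out first.
const multi = new MultiRateWindow({ requests: 10, words: 100 }, 60_000);
const multiFirst = multi.consume("ip-1", { requests: 1, words: 20 });
```

Every call loops over all configured counters, which is the per-request cost that grows with the number of custom rates.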

Answer 6 · 2024-02-15T00:47:34.000Z

Sorry for the delay, @enesakar @ogzhanolguncu please review the PR and give your feedback.

Answer 7 · 2024-02-18T15:22:26.000Z

A good use case for this feature is subtracting credits from a user's account. Currently the limit function always subtracts 1 credit from the account.
@sourabpramanik Looking at the API proposal above, I think it's a bit complicated. Is it possible for the API to simply ask for a number (e.g. credits) to subtract?
That would be the cleanest API for me. You might want to subtract 1.2 credits sometimes, or sometimes 0.5, or sometimes 10. If float is not allowed, int is also fine.

Answer 8 · 2024-02-18T15:29:06.000Z

Hey @off99555, by credits do you mean tokens? If so, you specify the max tokens when you initialize the Ratelimit, and each limit call subtracts a number of tokens that can be dynamic and does not have to be one. I still have to check the case of fractional tokens; rounding them might be the better option. I hope I understood you correctly.

Check out the example usage:
https://github.com/sourabpramanik/upstash-ratelimit/blob/64dd2c6273c369bac431a6ab278b0626ad1127ff/examples/with-vercel-kv/app/page.tsx#L8C1-L26C1

Answer 9 · 2024-02-18T15:38:25.000Z

@sourabpramanik Can the API be like this? I think we just need a number and no words.

const { success, limit, remaining, reset } = await ratelimit.limit(
    ip ?? "anonymous",
    undefined,
    20, // a custom deduction
  );

Or instead of the number, it can also be an object {subtract: 20} to allow for future change in the API.
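
As a sketch of how both shapes suggested here could be accepted, the helper below normalizes either a plain number or a `{subtract: n}` object into one deduction value. `normalizeDeduction` and the `Deduction` type are hypothetical names, not part of the SDK. Rounding fractional values up to the next integer is an assumption that follows the rounding idea raised earlier in the thread; an implementation that supports fractional credits would skip that step.

```typescript
// Hypothetical argument shape: a bare number or an object with `subtract`.
type Deduction = number | { subtract: number };

function normalizeDeduction(deduction: Deduction = 1): number {
  const value = typeof deduction === "number" ? deduction : deduction.subtract;
  if (!Number.isFinite(value) || value < 0) {
    throw new RangeError("deduction must be a finite, non-negative number");
  }
  // Assumption: fractional deductions are rounded up to whole tokens.
  return Math.ceil(value);
}

const fromNumber = normalizeDeduction(20);
const fromObject = normalizeDeduction({ subtract: 20 });
```

Omitting the argument falls back to a deduction of 1, which matches the current one-per-request behavior.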

Answer 10 · 2024-02-18T15:43:32.000Z

Yes, we can create two variants. One implements a single custom rate and needs no identifier for the rate, since there is only one, just like you mentioned. The other supports multiple custom rates, which need identifiers because we have to track which tokens are being exhausted. What do you think?

Answer 11 · 2024-02-18T15:47:04.000Z

I think having multiple custom rates is very expensive, because on every request we have to loop over them and run the script to deduct tokens from each one until one of them is exhausted.

Answer 12 · 2024-02-18T16:00:10.000Z

I believe we should avoid overcomplicating the API. The initial implementation proposal was fine. Also, let's add a more concrete example to make sure we are all on the same page, preferably one using OpenAI with chat completion or embeddings creation.

Answer 13 · 2024-02-18T16:03:24.000Z

Yes, the API should be simple. For my use case, I simply want a way to subtract different amounts for my AI image generation app.
For example, if the user is generating a big image, I want it to cost 2 tokens. If it's a small image, it's 1 token. Something like that. No need for named custom rates. I just need a way to send this arbitrary amount of tokens to the limit function.
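
The image example above could look roughly like this. Everything here is a sketch under assumptions: `WeightedLimiter` is a hypothetical interface standing in for a limiter whose `limit` call accepts a numeric cost, and `TOKEN_COST` and `generateImage` are illustrative names.

```typescript
type ImageSize = "small" | "big";

// Costs taken from the example: a big image costs 2 tokens, a small one 1.
const TOKEN_COST: Record<ImageSize, number> = { small: 1, big: 2 };

// Hypothetical shape of a limiter that accepts an arbitrary deduction.
interface WeightedLimiter {
  limit(identifier: string, cost: number): Promise<{ success: boolean; remaining: number }>;
}

async function generateImage(
  limiter: WeightedLimiter,
  userId: string,
  size: ImageSize
): Promise<{ ok: boolean; remaining: number }> {
  // Charge the size-dependent cost before doing any work.
  const { success, remaining } = await limiter.limit(userId, TOKEN_COST[size]);
  if (!success) return { ok: false, remaining };
  // The actual image generation call would go here.
  return { ok: true, remaining };
}
```

No named custom rates are involved; the caller decides the cost and passes a single number.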

Answer 14 · 2024-02-18T16:13:35.000Z

Yes, I agree. I will rework the API and let you all know.

Answer 15 · 2024-02-18T16:20:26.000Z

@off99555 I like the idea. Yeah, users definitely should be able to pick their own subtraction rates.

Answer 16 · 2024-02-18T17:42:57.000Z

@ogzhanolguncu @off99555 I have updated the API and the example app. Can you please check whether this works, so that I can then implement the same logic in the rest of the algorithms and the multi-region algorithms?

Answer 17 · 2024-02-18T19:11:21.000Z

@sourabpramanik, can you please share how the end user API will change? Perhaps you can commit the new README so we can easily see the new user API.

This is a must-have feature for AI applications (due to tokens), so your effort is greatly appreciated!

Answer 18 · 2024-02-19T00:05:01.000Z

Yes, definitely, I will do that.

Answer 19 · 2024-02-19T15:46:08.000Z

@enesakar as mentioned, I have added a small doc describing the change to the API.

Answer 20 · 2024-02-19T23:38:10.000Z

@sourabpramanik I added a comment.