peterkhayes/rolling-rate-limiter

I like this lib

ORESoftware opened this issue · 12 comments

(I removed this comment because it was counterproductive)

After looking through the source and reading about the design of this lib from your article I have to ask about why you chose this algorithm over a perhaps much more simple one. The short version of this is - we don't care how many events happen during a window, we just care if there are more than the maximum - so we don't need to keep track of all the events - we just keep a list of all the events below the max, which usually keeps the limit to a low and expected/known number.

Perhaps a simpler way to implement a rolling window is, given:

var reqWindow = 2000; // ms
var maxRequestsPerWindow = 50;
var identifier = req.ip; // some identifier of the requestor

what you do in a simple case is store an array of no more than maxRequestsPerWIndow (50 timestamps) in a database as a value (just use JSON stringify with Redis), with the key being the identifier of the request. If there are fewer than 50 timestamps, we do nothing except push a new timestamp. If there are 50 or more timestamps than we shift the oldest timestamp and then compare it with Date.now() - if the time difference is less than the window, we reject the latest request, but we still push the new event (the Date.now() timestamp) onto the array.

I am pretty certain that is all the logic required for this. Although it would be great if you could point out where I might be wrong or misunderestimating something. It seems like this implementation is way more complicated than the above. Thanks.

To check how many items are in the array, we need to make a request to Redis. In the time it will take for this request to return, and for us to make a decision about what to do, other servers may have made similar requests. This creates the possibility for race conditions, unless all actions can be completed in a single trip to Redis. Hence the current implementation.

Thanks for the response, sorry I was being a little flippant early but I am genuinely interested in learning more and I am not sure I follow 100% and am curious. I think in ideal circumstances the rate limiting would be done by a load balancer somehow. But perhaps race conditions could still occur with a single load balancer? Perhaps you're saying that a first request might come in and moments later a second request might come in and due to the nature of networks, the second request might get through but the first request would not?

Of course, if you do rate limiting on each node in a deployment then you might have to assume equal distribution of requests from a certain source/IP across all nodes, and that can get fuzzy, so I guess it's best to just discuss what goes on when rate limiting is implemented on the load balancer itself, or some single server instance.

We do rate limiting on each node, but with reference to a shared Redis instance. I haven't looked into rate limiting on the load balancer, but our load balancers (haProxy and ELB) are not written in Node.js, and so other libraries would need to be used.

IF only one instance is performing the rate limiting (i.e. a node server acting as a load balancer), then you're correct and your simpler algorithm could be used. However, in these cases, Redis would not really be needed either, and the whole thing could be done with an array in memory.

100% agree on the load balancing thing; also, there is probably need for some rate limiting logic on each node, so there is probably good reason to keep that code on each node, not on the load balancer.

Ok, so if you have only one server node, but 10,000 users. You don't want to keep an array in memory for each user, so you still want to use Redis for that. Race conditions could still occur here, though, if a user submits two requests that are just moments apart, because you have go to Redis for both requests.

If you have multiple nodes, and you use the same Redis instance, you basically have the same type of race condition as you do above !! Except the user's sequential requests happen to go to two separate nodes instead of the same node.

So I don't follow yet.

Would love to know if I am wrong or missing something on this

Is your concern that of the two requests, the first one could be blocked while the second one allowed? This it not the race condition this library seeks to eliminate, and could certainly happen. What won't happen though is if the user is 1 request under the limit, both will never be allowed.

A two step 'read-then-write' algorithm would be subject to this type of race condition and would potentially allow more requests through than it should.

Please tell me what race condition you are trying to eliminate.
On Apr 11, 2016 5:51 PM, "Nick Bottomley" notifications@github.com wrote:

Is your concern that of the two requests, the first one could be blocked
while the second one allowed? This it not the race condition this library
seeks to eliminate, and could certainly happen. What won't happen though is
if the user is 1 request under the limit, both will never be allowed.


You are receiving this because you authored the thread.
Reply to this email directly or view it on GitHub
#8 (comment)

Under a non-atomic algorithm (read-then-write), client A could ask how many requests are left. The reply is 1. Client A receives the response, and sees that it is allowed to go ahead with it's work. So it issues another request to tell Redis to increment the counter. Meanwhile, client B also asks how many requests are left. The increment from client A hasn't yet arrived, so the server replies '1'. Thus, the rate limit has been exceeded: both clients thought they were allowed to proceed, where one should have been forbidden.

Under this algorithm, only a single trip to Redis is needed, ensuring the situation described above can't happen. This does lead to some particular behavior (i.e. if a user continuously exceeds the limit, none of their requests get through) that, in our case, isn't a problem.

Thanks for the info. Ok, if I understand correctly - you are not rate limiting by user/client, but you are rate limiting overall, lumping all clients together?
Furthermore you have some scheme by which it's acceptable for a client to ask how many requests they have left?
You may be able to eliminate the race condition by rate limiting by user/client. Furthermore, the whole thing where you allow users to ask for how many requests they have remaining. I am not sure how that makes sense. How many requests they have left in which period? In the next 5 seconds, 10 seconds? Furthermore, doing this read then write, is akin to using fs.exists before fs.readFile; the experts say: just read try to read the file and if it fails then handle the error; if you read before your read, you introduce race conditions, by the very nature of the thing.

http://stackoverflow.com/questions/28532820/fs-exists-fs-existssync-why-are-they-deprecated

You most likely have some special use case for this module. I don't want to waste your time because I am not an expert in rate limiting. I am simply curious and saw "the article that Peter wrote" online which purported this to be a general purpose rate-limiting algorithm. But from what I have heard it's not that general purpose if it doesn't rate limit by user? Looking at the code though it does appear to do so. I appreciate the info because it is an interesting topic, thanks

This module can be used to limit any resource in nearly anyway. It is up to the caller to generate the appropriate Redis key. Thus, a key could be 123:some-api-endpoint limiting user with id 123 access to the some-api-endpoint. Whenever a call for that API endpoint comes in from that user, call the limiter with that key.

You are right, you can't query the limit without changing it. For a public, developer-facing API with rate limits, that would be a crucial drawback. However, we use this to limit overuse of undocumented API's by malicious or misbehaving clients.

I take your point about fs.exists however the crucial part is that we are implementing this using existing Redis data structures and commands (which dramatically constrain what kinds of algorithms we can use) so that we can rely directly on a high-quality battle-tested piece of open source software rather than rolling our own rate limiting service.

Interesting approach in #8 (comment).

It is correct that this approach could be done within a redis script, which is atomic by nature.

The question is:

Is ZREMRANGEBYSCORE + ZCARD + ZADD

faster

or

ZRANGEBYSCORE() LIMIT [amount] + [amount] * ZREM() + ZADD()

--

I think it depends on the number of items in the set for high N's, e.g 200000 req/hour the second is likely more efficient, for low N it does not matter.