The goal of this little R&D was to come up with a way for web services to leverage rate limiting as an external service. The implementation is inspired by the many researches done to find a good starting point for more production-like implementation.
- Rolling window (with redis[1]) has been chosen (knowing drawbacks, especially size of lists for dayly/monthy limits for example)
- Redis (lists[1] and model[2]) has been chosen to help with the distributed need, and scalability.
- Kong (Rate Limiting Plugin[3], Enterprise version[4]) provided a LUA implementation of this, in an API Gateway setting (so different from here)
-
Create application in Okta with
client_credentials
grant type at least and addcustomScope
scope. Keep the domain (something like https://dev-abcd.okta.com), client id and secret handy to use those to get a token later on. -
Create
.env
file based on.env.sample
. Fill in Okta client id and domain. -
Start app with:
npm start
- Start redis
redis-server
- Get token from Okta:
curl -X POST -H "Authorization: Basic $(echo -n "_your_clientid:_your_clientsecret" | base64)" https://_your_okta_domain/oauth2/default/v2/token --data "grant_type=client_credentials&scope=customScope"
- Copy the
access_token
from it and export it:
export TOKEN="...access token here..."
- Setup your client rate limit policy:
curl -X POST -H "Authorization: Bearer $TOKEN" http://localhost:3000/clients --data "config.minute=10" --data "config.hour=100"
You should get a JSON response echoing the configuration you just specified, something like this:
[
{
"clientId": "_your_client_id",
"period": "minute",
"value": "10"
},
{
"clientId": "_your_client_id",
"period": "hour",
"value": "100"
}
]
- Check rate limit against a resource key:
curl -H "Authorization: Bearer $TOKEN" http://localhost:3000/clients/_your_client_id/resources/some_arbitrary_key
You should get a JSON response with periods and their status (pass or exceeded). Something of the sort:
{
"periods": [
{
"period": "minute",
"details": {
"status": "exceeded limit",
"limit": 10,
"remaining": 0,
"retryAfter": 39106
}
},
{
"period": "hour",
"details": {
"status": "pass",
"limit": 100,
"remaining": 88,
"retryAfter": 0
}
}
],
"retryAfter": 39106
}
From a client standpoint, one should consider getting the retryAfter
value and pause/sleep/wait for its amount of time in millis (or reject a request for the key used here).
In case of rate-limiting, HTTP status code of 429
(Too Many Requests) will be returned. HTTP headers are also provided in the response (rate-limited or not):
X-RateLimit-Limit-minute: 10
X-RateLimit-Remaining-minute: 0
X-RateLimit-Remaining-minute-Reset: 51624
X-RateLimit-Limit-day: 100
X-RateLimit-Remaining-day: 77
The -Reset
headers are only provided when rate limited.
- EC2: m5.large (): $71.424/month
- Classic Load Balancer: $18.6/month + $0.104 ($0.008*13GB=$0.104) = $18.704/month
- ...
Assuming:
- An average 100 new connections per second for Load Balancer
- Client sends an average of 5 requests per second per connection (over 94M requests per month)
- Each request lasts around 100ms
- 2,000 bytes per request (JWT token is quite big) (about 12.5GB/month)
- 10 rules on the load balancer
- 128MB memory for Lambda
- Using DynamoDB to store rate-limit policies. Policies are written only once or so, so writes are not considered. We use a cache (5 minutes) to avoid too many reads, leading to about 9K reads, which is again marginal.
Estimated costs is <$75/month
Details:
- DynamoDB: <$1/month
- Writes: $0
- Reads: $0
- Storage: $0 (Free tier)
- Application Load Balancer: $40.548/month
- Load balancer: $0.0225/hour = $16.74/month (0.0225 x 24 x 31)
- LCU:
- New connections (per second): 4 LCUs (100 connections per second / 25 connections per second)
- Active connections (per minute): 0.03 LCUs (100 active connections per minute / 3,000 active connections per minute)
- Processed bytes (GBs per hour): 0.034 LCUs (0.3352 GB/1 GB).
- Rule evaluations (per second): 0 LCU
- Total = $0.032/hour (4 LCU x $0.008/hour) = $23.808/month ($0.032 x 24 x 31)
- Lambda: $31.872/month
- Monthly compute charges = $13.1
- Compute (seconds): 94,860,000 requests per month lasting 100ms = 9,486,000 seconds
- Compute (GB-s): 9,486,000 seconds x 128MB/1024 = 1,185,750 GB-s
- Total Compute = 1,185,750 GB-s – 400,000 free tier GB-s = 785,750 GB-s
- Total = 785,750 GB-s x $0.00001667 = $13.1
- Monthly requests charges = $18.772
- 94,860,000 requests – 1M free tier requests = 93,860,000 Monthly billable requests
- Total = 93.860 x $0.2 = $18.772
- Monthly compute charges = $13.1
[1] Rate Limiting using Redis Lists and Sorted Sets, Sahil Jadon, https://medium.com/@sahiljadon/rate-limiting-using-redis-lists-and-sorted-sets-9b42bc192222
[2] System Design — Rate limiter and Data modelling, Sai Sandeep Mopuri, https://medium.com/@saisandeepmopuri/system-design-rate-limiter-and-data-modelling-9304b0d18250
[3] Rate Limiting, Kong, https://docs.konghq.com/hub/kong-inc/rate-limiting/#enabling-the-plugin-on-a-service
[4] Rate Limiting (Enterprise), Kong, https://docs.konghq.com/enterprise/references/rate-limiting/