Rate Limiter: Limit by Percent Time
numeralnathan opened this issue · 6 comments
Please enhance Rate Limiter so that the amount of time spent waiting for the operation is less than a configurable percent.
Let's say an operation takes 10 seconds and the percent is 10%. Then, the rate limiter wouldn't allow another operation to proceed for another 100 seconds so that the overall time spent in the operation is ≤ 10%.
If the operation is a HTTP request, then the above rate limiter can dynamically respond to server performance. If the server is responding quickly, then the rate limiter would have little impact on the request rate. If the server is responding slowly, then the rate limiter would slow down the request rate.
This is very important when an operation needs to have less than 1% impact.
This sounds like a variation of the max wait time setting, where we'd adjust the max wait time to be some percentage of the last (N?) executions time(s) through the rate limiter? Currently, a rate limiter doesn't measure execution times exactly (there's no explicit "releasePermit" method). It just tracks the rate of execution requests (permits) but doesn't care when they complete. Are you suggesting we should track actual execution times and base the current max wait time off of some recent execution time(s)?
@jhalterman I am suggesting the rate limiter track the recent execution times from multiple operations. Before an operation can proceed, calculate the following:
average execution time = total execution time ÷ number of operations
total execution time += average execution time
execution delay = wall clock time * configured percent - total execution time
Note: The average execution time is kept in a local variable for later use.
The rate limiter then waits the above calculated execution delay and measures the operation's execution time. The following adjustment to the total execution time is made.
total execution time += operation's execution time - average execution time
Note: The average execution time is the value kept in a local variable above.
The total execution time is modified before the operation executes so that subsequent operations will potentially wait longer before executing.
One challenge is deciding where the best place to store historic execution information might be, and also, how many execution times to store. Since this might be a bit use case specific, another option would be to accept a maxWait predicate on a RateLimiter, where a user can decide, based on whatever criteria, whether to permit an execution or not. This implies having access to some stats about execution times though.
@jhalterman You can store historic execution information in ImpactRateLimiterImpl
, a new class. You could have private volatile Duration totalTime
and private volatile long count
fields. A VarHandle.compareAndSet()
can be used to add to the totalTime
.
Before the operation's execution, ImpactRateExecutor
adds the average operation execution time to "reserve" its operation execution start time. This determines how long the execution must wait before starting (or not wait at all). After the operation's execution, ImpactRateExecutor
adds the actual operation execution time minus the average used in the first sentence. The first execution does not add anything before its execution.
Executions will do one of the following:
- Proceed immediately because the total execution time does not exceed the maximum impact.
- Wait for a short amount of time because the execution's average time will cause the impact to exceed the maximum impact.
- Throw an exception because the wait time exceeds the maximum wait time.
VarHandle
is Java 9+, and I would very much like to maintain Java 8 compatibility.
@Tembrel Ah, you're right. So, you can use AtomicReference<Duration>
and AtomicLong
.