RedisLabs/memtier_benchmark

Latency decreases as throughput increases under rate-limiting

Avidanborisov opened this issue

Using the recently introduced --rate-limit option, I've run a simple benchmark in which the rate limit doubles on each iteration, to see how latency relates to throughput as the sustained throughput increases.

Here are the results, along with the command to reproduce them (the server is Redis):

$ for i in 1000 2000 4000 8000 16000 32000 64000 128000; do memtier_benchmark -h 10.20.1.4 --hide-histogram --test-time 30 --threads 1 --clients 50 --rate-limit $((i/50)); done 2>/dev/null | grep -E "(Type|Totals)"
Type         Ops/sec     Hits/sec   Misses/sec    Avg. Latency     p50 Latency     p99 Latency   p99.9 Latency       KB/sec
Totals       1001.44         0.00       909.79         0.97415         0.75900         8.19100        10.04700        42.50
Type         Ops/sec     Hits/sec   Misses/sec    Avg. Latency     p50 Latency     p99 Latency   p99.9 Latency       KB/sec
Totals       2001.06         0.00      1817.78         0.81948         0.71100         6.23900         9.72700        84.94
Type         Ops/sec     Hits/sec   Misses/sec    Avg. Latency     p50 Latency     p99 Latency   p99.9 Latency       KB/sec
Totals       4000.67         0.00      3635.77         0.77258         0.62300         6.27100         9.79100       169.73
Type         Ops/sec     Hits/sec   Misses/sec    Avg. Latency     p50 Latency     p99 Latency   p99.9 Latency       KB/sec
Totals       7999.25         0.00      7271.14         0.69692         0.63100         2.62300         9.40700       339.32
Type         Ops/sec     Hits/sec   Misses/sec    Avg. Latency     p50 Latency     p99 Latency   p99.9 Latency       KB/sec
Totals      15747.45         0.00     14314.50         0.66834         0.63900         1.31100         8.70300       667.98
Type         Ops/sec     Hits/sec   Misses/sec    Avg. Latency     p50 Latency     p99 Latency   p99.9 Latency       KB/sec
Totals      31851.22         1.67     28953.36         0.65727         0.64700         1.19100         7.03900      1350.96
Type         Ops/sec     Hits/sec   Misses/sec    Avg. Latency     p50 Latency     p99 Latency   p99.9 Latency       KB/sec
Totals      63371.75         6.67     57603.23         0.64743         0.65500         1.15900         4.22300      2688.16
Type         Ops/sec     Hits/sec   Misses/sec    Avg. Latency     p50 Latency     p99 Latency   p99.9 Latency       KB/sec
Totals      78407.32        24.70     71254.00         0.63731         0.63900         1.07900         4.25500      3326.53

The results are surprising: all the latency metrics seem to decrease as the throughput increases. Based on similar benchmarks, I expected latency to increase gradually with the sustained throughput and eventually explode once a bottleneck is reached. Is there an issue with the latency calculation, or is this expected?

Thanks!

@Avidanborisov I've tested your script, with a few changes to ensure that the conditions on each iteration are identical, namely:

  • ensure a clean DB at the start of each stage
  • use a small key range so that the number of commands issued does not affect performance (i.e. longer or higher ops/sec runs don't grow the key range)
  • avoid benchmarking invalid/faster commands (like the misses/sec in your output above) by using only a write command
  • also check the internal DB latency

script:

#!/bin/bash
HOST=192.168.1.200
PORT=6379
C=50
A="perf"
for rate in 1000 2000 4000 8000 16000 32000 64000 128000; do
    echo "--------------------------------------------------"
    echo "running $rate"
    redis-cli --no-auth-warning -a $A -h $HOST flushall >/dev/null
    redis-cli --no-auth-warning -a $A -h $HOST config resetstat >/dev/null
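    # Single key (--key-minimum/--key-maximum 1), SET-only workload (--ratio 1:0); the total rate is split evenly across the $C clients.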
    memtier_benchmark --test-time 60 -s $HOST -p $PORT -c $C -t 1 --rate-limiting  $(($rate/$C)) -a $A --json-out-file $rate.json --key-maximum 1 --key-minimum 1 --ratio 1:0 --hide-histogram 2>/dev/null | grep -E "(Type|Totals)"
    redis-cli --no-auth-warning -a $A -h $HOST info commandstats | grep "set:" | awk '{split($0,a,","); print a[1],a[3]}'
    echo "--------------------------------------------------"
    sleep 15
done

After running the above on two physical nodes set to the Static High Performance power mode (processors run in the maximum power and performance state regardless of the OS power management policy), we get the following results:

Ops/sec     Avg internal cmd latency (ms)   Avg client latency incl. RTT (ms)   p50 incl. RTT (ms)   p99 incl. RTT (ms)   p99.9 incl. RTT (ms)
999.94      0.00033                         0.21428                             0.199                1.271                2.239
1967.02     0.00034                         0.18509                             0.175                0.359                2.079
3999.71     0.00034                         0.17738                             0.175                0.287                1.831
7999.07     0.00035                         0.17585                             0.175                0.287                1.495
16270.22    0.00035                         0.17376                             0.175                0.271                0.583
31996.51    0.00034                         0.17849                             0.175                0.287                0.407
56753.64    0.00034                         0.17604                             0.175                0.279                0.391
57265.40    0.00034                         0.17452                             0.175                0.271                0.383

and plotting them:
[plot: latency percentiles vs. achieved ops/sec for the table above]

As the data above confirms, I strongly suspect that your system has some kind of power governor scaling the CPU frequency. For example, if I enable power saving on the nodes, I immediately get different results, with the best performance appearing only at the end:
[plot: the same sweep with power saving enabled on the nodes]
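On a Linux host, one quick (purely illustrative) way to check whether such a governor is in play is to read the cpufreq sysfs interface and sample the reported core frequencies, e.g.:

# Which cpufreq governor is each core using? (counts per governor value)
cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor | sort | uniq -c

# Sample the current core frequencies; large variation during a low-rate stage
# suggests the CPU is being clocked down between requests.
grep "cpu MHz" /proc/cpuinfo | sort | uniq -c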

I suggest you check this on your end; if you can't control the frequency, run the benchmark with the rates in reverse order (start at the highest rate and end at the lowest), and you should see the best results at the end.
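For example, a minimal variation of the script above (reusing the same HOST/PORT/C/A variables) that runs the stages from the highest rate down to the lowest:

# Same stages as above, highest rate first: if frequency scaling is the cause,
# the best latencies should now show up at the low-rate end of the sweep.
for rate in 128000 64000 32000 16000 8000 4000 2000 1000; do
    echo "running $rate"
    memtier_benchmark --test-time 60 -s $HOST -p $PORT -c $C -t 1 --rate-limiting $(($rate/$C)) -a $A --key-maximum 1 --key-minimum 1 --ratio 1:0 --hide-histogram 2>/dev/null | grep -E "(Type|Totals)"
    sleep 15
done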
For now I'm closing this issue, given there is no evidence of an issue in the memtier or Redis code :)

@filipecosta90 Thanks for the detailed and clear response. I'll be sure to take a look at your suggestions for more accurate benchmarking of this phenomenon.

However, as far as I can tell from the data you attached, the overall throughput-latency pattern looks similar to mine and is still unclear to me. My general understanding is that the relation between latency, offered load and sustained load should look like this [1]:

[figures from [1]: expected latency vs. offered load and latency vs. sustained load curves]
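(As a rough illustration of why the expected curve should bend upward, and not something derived from the measurements here: in the simplest open-queue model, M/M/1 with mean service time $S$ and sustained arrival rate $\lambda$,

$$ R = \frac{S}{1 - \rho}, \qquad \rho = \lambda S < 1, $$

so the mean response time $R$ grows without bound as the sustained rate approaches the capacity $1/S$.)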

For instance, benchmarking Nginx on my system with wrk2 yields the following curves, similar to the ones above:

[plot: Nginx latency vs. rate, measured with wrk2]

Whereas Redis (with memtier_benchmark) yields:

[plot: Redis latency vs. rate, measured with memtier_benchmark]

The plot you attached looks similar to mine: as the sustained rate increases, the latency curve trends downwards instead of upwards, and it never "explodes" even when the sustained rate falls far short of the requested rate.

I'd like to understand the reason for this. In particular, why is the latency worse at very low rates than at high rates, and why does it not climb once the sustained rate reaches its maximum?

I'm far from an expert on the subject, but I have a feeling that the latency calculation does not take coordinated omission into account. Both my script and yours tested the system with a single thread, but I couldn't reproduce the expected theoretical behavior with multiple threads either (and with wrk2 the calculation behaves as expected even with a single thread). If there's an alternative set of command-line options that yields the expected theoretical behavior, that's fine too.
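To make the coordinated-omission concern concrete: the sketch below is purely illustrative and assumes a hypothetical per-request log with the intended send time, actual send time and completion time (in milliseconds); it is not output that memtier_benchmark or wrk2 actually produce.

# requests.log: one request per line, columns: intended_send_ms actual_send_ms completion_ms
# "Uncorrected" latency ignores the time a request spent queued behind a slow
# predecessor; the corrected value charges that wait back to the request.
awk '{
    uncorrected = $3 - $2      # completion - actual send
    corrected   = $3 - $1      # completion - intended (scheduled) send
    usum += uncorrected; csum += corrected; n++
} END {
    if (n > 0)
        printf "avg uncorrected: %.3f ms   avg corrected: %.3f ms\n", usum/n, csum/n
}' requests.log

Under a constant-rate schedule, a single stalled response delays all subsequent sends, so the uncorrected average can stay flat while the corrected one climbs.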

Thanks again!

[1]: https://perfdynamics.blogspot.com/2010/03/bandwidth-vs-latency-world-is-curved.html