redis/redis-benchmarks-specification

[BUG]: running certain tests with --run-count > 1 causes redis server to crash due to out of memory

slice4e opened this issue · 0 comments

running certain tests with --run-count > 1 causes redis server to crash due to out of memory

To reproduce:
Start the redis server:
taskset -c 95 ./src/redis-server --port 6379 --logfile server.log --save ""

Run a test which loads data multiple times:
taskset -c 0,1,2,3 /usr/local/bin/memtier_benchmark --port 6379 --server localhost --json-out-file oss-standalone-2023-01-24-15-30-30-NA-memtier_benchmark-1Mkeys-load-stream-5-fields-with-100B-values-pipeline-10.json "--pipeline" "10" "--data-size" "100" --command "XADD key * field data field data field data field data field data" --command-key-pattern="P" --key-minimum=1 --key-maximum 1000000 --test-time 180 -c 50 -t 4 --hide-histogram --run-count=10

Fails with:
Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/redis_benchmarks_specification/runner/runner.py", line 722, in process_self_contained_coordinator_stream
used_memory_check(
File "/usr/local/lib/python3.8/dist-packages/redis_benchmarks_specification/runner/runner.py", line 915, in used_memory_check
exit(1)
File "/usr/lib/python3.8/_sitebuiltins.py", line 26, in __call__
raise SystemExit(code)
SystemExit: 1

This error is printed when a test exceeds the configured server memory capacity, but it is not what crashes the Redis server. When we run with --run-count=10, however, the Redis server itself crashes from running out of memory, which prevents the rest of the tests from completing.

In my testing, a single run generated ~27 GB of data in memory on Cascade Lake and about 34 GB on Ice Lake, as reported by: ./redis-cli info | grep used_memory_human
A few runs will easily generate hundreds of GB of data in memory.
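As a rough illustration of how quickly this adds up, here is a minimal sketch of the arithmetic. It assumes each run adds about the same amount of data and that nothing is flushed between runs; the function name `estimate_used_memory_gb` is hypothetical and not part of the runner.

```python
def estimate_used_memory_gb(per_run_gb, run_count):
    """Rough estimate of memory used after `run_count` load runs.

    Assumption: memory grows linearly, i.e. every run adds roughly
    `per_run_gb` and the data set is not flushed between runs.
    """
    if run_count < 1:
        raise ValueError("run_count must be >= 1")
    return per_run_gb * run_count


# Per-run figures are the ones measured above with used_memory_human.
for platform, per_run_gb in (("Cascade Lake", 27), ("Ice Lake", 34)):
    total_gb = estimate_used_memory_gb(per_run_gb, 10)
    print(f"{platform}: ~{total_gb} GB after 10 runs")
```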

Proposal:

  1. Benchmark all load-heavy tests and adjust the required server memory to match.
  2. If run-count > 1 is specified, increase the required server memory by multiplying it by the run count, and ensure that much memory is available. The relevant values can be read with:
    ./redis-cli info | grep used_memory_peak_human
    ./redis-cli info | grep total_system_memory_human
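The check in item 2 could look roughly like the sketch below. It parses the text returned by `redis-cli info` and compares the single-run peak, multiplied by the run count, against total system memory. It reads the numeric `used_memory_peak` and `total_system_memory` fields (bytes) rather than their `_human` variants, so no unit parsing is needed. The helper names and the sample values are hypothetical; this is not the runner's existing `used_memory_check`.

```python
def parse_info(text):
    """Parse `redis-cli info` output ("key:value" lines) into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks and section headers
            continue
        key, sep, value = line.partition(":")
        if sep:
            fields[key] = value
    return fields


def fits_for_run_count(info_text, run_count):
    """Return True if run_count runs are expected to fit in system memory.

    Assumption: `info_text` was captured after a single run, and each
    further run adds roughly the same amount of data.
    """
    info = parse_info(info_text)
    peak_bytes = int(info["used_memory_peak"])
    total_bytes = int(info["total_system_memory"])
    return peak_bytes * run_count <= total_bytes


# Illustrative values only: 27 GiB peak on a host with 252 GiB of RAM.
SAMPLE_INFO = (
    "# Memory\r\n"
    "used_memory_peak:28991029248\r\n"
    "total_system_memory:270582939648\r\n"
)

if not fits_for_run_count(SAMPLE_INFO, 10):
    print("not enough memory for --run-count=10; lower run-count or skip the test")
```

Running a check like this before the first load would let the runner skip or fail the test cleanly instead of letting the server run out of memory.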