Source code for benchmarking the RediSearch module, which provides scalable, high-performance full-text search on top of Redis.
This is a Go application that can ingest data into the search engine and benchmark the throughput and latency of queries run against it.
It supports reading Wikipedia Abstract data dumps and indexing them in three search engines: RediSearch (redis), Elasticsearch (elastic), and Solr (solr).
Below is the output of the benchmark, run on a full abstracts dump of the English Wikipedia, with 5 Redis shards on a c4.4xlarge EC2 instance:
Benchmark | Concurrent Clients | Throughput (requests/sec) | Average Latency (ms) |
---|---|---|---|
search: hello | 1 | 2737.57 | 0.36 |
search: hello | 8 | 9706.18 | 0.80 |
search: hello | 16 | 11201.99 | 1.39 |
search: hello | 32 | 13198.80 | 2.36 |
search: hello | 64 | 17663.32 | 3.51 |
search: barack obama | 1 | 2505.68 | 0.40 |
search: barack obama | 8 | 8522.50 | 0.91 |
search: barack obama | 16 | 10346.83 | 1.50 |
search: barack obama | 32 | 11720.58 | 2.66 |
search: barack obama | 64 | 16482.96 | 3.77 |
exact: "united states of america" | 1 | 618.86 | 1.62 |
exact: "united states of america" | 8 | 816.22 | 9.38 |
exact: "united states of america" | 16 | 816.47 | 18.97 |
exact: "united states of america" | 32 | 815.41 | 37.44 |
exact: "united states of america" | 64 | 801.75 | 75.61 |
search: manchester united | 1 | 1513.97 | 0.66 |
search: manchester united | 4 | 3192.11 | 1.23 |
search: manchester united | 16 | 3485.66 | 4.43 |
search: manchester united | 32 | 3512.08 | 8.80 |
search: manchester united | 64 | 3559.03 | 17.29 |
autocomplete (2-3 letters) | 1 | 4145.90 | 0.24 |
autocomplete (2-3 letters) | 4 | 9691.04 | 0.41 |
autocomplete (2-3 letters) | 8 | 12129.34 | 0.64 |
autocomplete (2-3 letters) | 16 | 15268.47 | 1.00 |
autocomplete (2-3 letters) | 32 | 16064.66 | 1.90 |
autocomplete (2-3 letters) | 64 | 17255.77 | 3.51 |
autocomplete (2-3 letters) | 128 | 17935.49 | 6.47 |
For each benchmark, we append a single line to a CSV file containing the engine used, the benchmark type, the query, the concurrency, the throughput and the latency.
The default file name is benchmark.csv; running the app with `-o -` prints the results to stdout instead.
The output of running a benchmark on the queries "foo,bar,baz" with 4 concurrent clients looks like this:

```
redis,"search: foo,bar,baz",4,14997.81,0.27
```
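Note that the query field is quoted because it contains commas, exactly as Go's standard `encoding/csv` writer would emit it. A minimal sketch of producing such a row (the function name and signature are illustrative, not the tool's actual code):

```go
package main

import (
	"bytes"
	"encoding/csv"
	"fmt"
)

// resultRow renders one benchmark result as a CSV line:
// engine, benchmark title, concurrency, throughput (req/s), average latency (ms).
func resultRow(engine, title string, concurrency int, throughput, latencyMs float64) string {
	var buf bytes.Buffer
	w := csv.NewWriter(&buf)
	// encoding/csv quotes the title automatically because it contains commas.
	w.Write([]string{
		engine,
		title,
		fmt.Sprintf("%d", concurrency),
		fmt.Sprintf("%.2f", throughput),
		fmt.Sprintf("%.2f", latencyMs),
	})
	w.Flush()
	return buf.String()
}

func main() {
	fmt.Print(resultRow("redis", "search: foo,bar,baz", 4, 14997.81, 0.27))
	// → redis,"search: foo,bar,baz",4,14997.81,0.27
}
```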
```
Usage of ./RediSearchBenchmark:
  -benchmark string
        [search|suggest] - if set, we run the given benchmark
  -c int
        benchmark concurrency (default 4)
  -duration int
        number of seconds to run the benchmark (default 5)
  -engine string
        [redis|elastic|solr] The search backend to run (default "redis")
  -file string
        Input file to ingest data from (wikipedia abstracts)
  -fuzzy
        For redis only - benchmark fuzzy auto suggest
  -hosts string
        comma separated list of host:port to redis nodes (default "localhost:6379")
  -o string
        results output file. set to - for stdout (default "benchmark.csv")
  -queries string
        comma separated list of queries to benchmark (default "hello world")
  -scores string
        read scores of documents CSV for indexing
  -shards int
        the number of partitions we want (AT LEAST the number of cluster shards) (default 1)
```
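The usage text above follows the conventions of Go's standard `flag` package. A minimal sketch of declaring a comparable subset of these options (an illustration only, not the tool's actual source):

```go
package main

import (
	"flag"
	"fmt"
)

// options mirrors a subset of the command line shown above.
type options struct {
	engine      string
	hosts       string
	concurrency int
	duration    int
	output      string
}

// parseOptions declares the flags with the standard flag package and parses
// the given argument list, returning the resolved configuration.
func parseOptions(args []string) (*options, error) {
	o := &options{}
	fs := flag.NewFlagSet("RediSearchBenchmark", flag.ContinueOnError)
	fs.StringVar(&o.engine, "engine", "redis", "[redis|elastic|solr] The search backend to run")
	fs.StringVar(&o.hosts, "hosts", "localhost:6379", "comma separated list of host:port to redis nodes")
	fs.IntVar(&o.concurrency, "c", 4, "benchmark concurrency")
	fs.IntVar(&o.duration, "duration", 5, "number of seconds to run the benchmark")
	fs.StringVar(&o.output, "o", "benchmark.csv", "results output file. set to - for stdout")
	if err := fs.Parse(args); err != nil {
		return nil, err
	}
	return o, nil
}

func main() {
	o, _ := parseOptions([]string{"-engine", "redis", "-c", "32"})
	fmt.Println(o.engine, o.concurrency, o.output)
	// → redis 32 benchmark.csv
}
```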
Example: ingesting a Wikipedia abstracts dump, partitioned across 4 Redis shards:

```
./RediSearchBenchmark -engine redis -shards 4 -hosts "localhost:6379,localhost:6380,localhost:6381,localhost:6382" \
    -file ~/wiki/enwiki-20160305-abstract.xml -scores ~/wiki/scores.csv
```

Example: running the search benchmark on two queries with 32 concurrent clients, writing the results to out.csv:

```
./RediSearchBenchmark -engine redis -shards 4 -hosts "localhost:6379,localhost:6380,localhost:6381,localhost:6382" \
    -benchmark search -queries "hello world,foo bar" -c 32 -o out.csv
```