These instructions describe how to benchmark AirIndex Manual and AirIndex (the auto-tuned index) for the experiments in AirIndex: Versatile Index Tuning Through Data and Storage.
Please follow the dataset and query key set instructions to set up the benchmarking environment. These are examples of environment reset scripts. The following assumes that the datasets are under /path/to/data/ and the key sets are under /path/to/keyset/.
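The commands in this document pass locations as file:// URLs rather than plain paths. The sketch below shows one way to derive them from local directories. `to_file_url` is a hypothetical helper, not part of the AirIndex repository, and it assumes absolute paths that need no URL escaping.

```shell
# Hypothetical helper (not part of the AirIndex repository): turn an absolute
# local path into the file:// form used by the commands in this document.
# Assumption: the path needs no URL escaping (no spaces or special characters).
to_file_url() {
  _path="$1"
  # Only absolute paths map cleanly onto file:///...
  case "$_path" in
    /*) ;;
    *) echo "to_file_url: expected an absolute path, got '$_path'" >&2; return 1 ;;
  esac
  # Strip trailing slashes, but keep a bare "/" intact
  while [ "${_path%/}" != "$_path" ] && [ -n "${_path%/}" ]; do
    _path="${_path%/}"
  done
  printf 'file://%s\n' "$_path"
}

DATA_URL="$(to_file_url /path/to/data/)"
KEYSET_URL="$(to_file_url /path/to/keyset/)"
echo "$DATA_URL"
echo "$KEYSET_URL"
```

The resulting values can be substituted wherever file:///path/to/data and file:///path/to/keyset appear in the commands.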
cargo build --release
Optionally, you can run the unit tests to check compatibility.
cargo test
For each storage (e.g., NFS) you would like to benchmark on, tune and build indexes for all datasets.
bash scripts/sosd_experiment.sh file:///path/to/data file:///path/to/keyset file:///path/to/manual btree btree build 1 ~/reload_nfs.sh nfs
bash scripts/sosd_experiment.sh file:///path/to/data file:///path/to/keyset file:///path/to/airindex_nfs enb step,band_greedy,band_equal build 1 ~/reload_nfs.sh nfs
Afterwards, benchmark over 40 key sets of 1M keys each.
bash scripts/sosd_experiment.sh file:///path/to/data file:///path/to/keyset file:///path/to/manual btree btree benchmark 40 ~/reload_nfs.sh nfs
bash scripts/sosd_experiment.sh file:///path/to/data file:///path/to/keyset file:///path/to/airindex_nfs enb step,band_greedy,band_equal benchmark 40 ~/reload_nfs.sh nfs
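The build and benchmark commands above differ only in the action and the repeat count, so they can be generated from one place. The following is a minimal sketch, not a script shipped with the repository: `run_step`, `sosd_pipeline`, and `DRY_RUN` are hypothetical names, and the per-storage output path `airindex_<storage>` generalizes the `airindex_nfs` naming used above as an assumption. By default it only prints the commands.

```shell
# Sketch: print (or run) the build and benchmark commands for one storage.
# DRY_RUN, run_step, and sosd_pipeline are illustrative names, not repo features.
# Set DRY_RUN=0 to actually execute the commands from the repository root.
DRY_RUN="${DRY_RUN:-1}"

run_step() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"          # dry run: show the command only
  else
    "$@"               # real run: execute it as given
  fi
}

sosd_pipeline() {
  storage="$1"         # storage name as used in this document, e.g. nfs
  reset_script="$2"    # reset script for that storage
  data="file:///path/to/data"
  keyset="file:///path/to/keyset"
  # Each entry is "<action> <repeat count>", matching the commands above
  for step in "build 1" "benchmark 40"; do
    action="${step% *}"
    repeat="${step#* }"
    run_step bash scripts/sosd_experiment.sh "$data" "$keyset" \
      file:///path/to/manual btree btree \
      "$action" "$repeat" "$reset_script" "$storage"
    # Assumption: one AirIndex output directory per storage (airindex_<storage>)
    run_step bash scripts/sosd_experiment.sh "$data" "$keyset" \
      "file:///path/to/airindex_$storage" enb step,band_greedy,band_equal \
      "$action" "$repeat" "$reset_script" "$storage"
  done
}

sosd_pipeline nfs "$HOME/reload_nfs.sh"
```

Reviewing the printed commands before setting `DRY_RUN=0` helps catch a wrong path or reset script before a long build starts.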
The measurements will be recorded in sosd_benchmark_out.jsons.
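A quick sanity check on the output file can confirm that the benchmark wrote something. The sketch below assumes, based only on the .jsons extension, that the file holds one JSON record per line. It makes no assumption about field names, and `count_records` is a hypothetical helper.

```shell
# Assumption: the .jsons file holds one JSON record per line.
# The record schema is not described here, so no fields are parsed.
count_records() {
  # Count lines that contain at least one character; grep -c exits
  # non-zero when the count is 0, so mask that status.
  grep -c . "$1" || true
}

out_file="sosd_benchmark_out.jsons"
if [ -f "$out_file" ]; then
  echo "records: $(count_records "$out_file")"
  # Show the most recent record as raw text
  tail -n 1 "$out_file"
else
  echo "no $out_file in $(pwd); run the benchmark step first"
fi
```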
Inspect a latency breakdown for the already built indexes with the following commands.
bash scripts/sosd_experiment.sh file:///path/to/data file:///path/to/keyset file:///path/to/manual btree btree breakdown 40 ~/reload_nfs.sh nfs
bash scripts/sosd_experiment.sh file:///path/to/data file:///path/to/keyset file:///path/to/airindex_nfs enb step,band_greedy,band_equal breakdown 40 ~/reload_nfs.sh nfs
The measurements will be recorded in sosd_breakdown_out.jsons.
Generate skewed Zipfian key sets by following the instructions.
Then run the benchmark script, pointing it to the skewed key sets.
bash scripts/sosd_experiment.sh file:///path/to/data file:///path/to/keyset/skew file:///path/to/airindex_nfs enb step,band_greedy,band_equal benchmark 40 ~/reload_nfs.sh nfs
As in 5.2, build the AirIndex variants.
bash scripts/sosd_variants.sh file:///path/to/data file:///path/to/keyset file:///path/to/airindex_variants_index build 1 ~/reload_nfs.sh nfs
Then, benchmark all of them.
bash scripts/sosd_variants.sh file:///path/to/data file:///path/to/keyset file:///path/to/airindex_variants_index benchmark 40 ~/reload_nfs.sh nfs
Let AirIndex tune indexes over a variety of affine storage profiles. We highly recommend running this on a CPU-rich machine; otherwise, it will take considerable time.
bash scripts/storage_explore.sh file:///path/to/data file:///path/to/keyset file:///path/to/storage_explore enb
Then, read the index structures.
bash scripts/inspect.sh file:///path/to/data file:///path/to/keyset file:///path/to/storage_explore enb
To measure the build time, run the build script.
bash scripts/scale.sh file:///path/to/data file:///path/to/keyset file:///path/to/airindex_scalability enb scalability.jsons
Build indexes with varying hyperparameter k by using a different action, buildtopk.
bash scripts/sosd_experiment.sh file:///path/to/data file:///path/to/keyset file:///path/to/airindex_nfs enb step,band_greedy,band_equal buildtopk 1 ~/reload_nfs.sh nfs
To execute Data Calculator's auto-completion, run the following command.
bash scripts/data_calculator_sosd.sh file:///path/to/data file:///path/to/keyset file:///path/to/data_calc autocomplete 1 ~/reload_nfs.sh nfs
Then copy the suggested structure printed at the end (load and number of layers) into scripts/data_calculator_sosd.sh (lines 51-69).
Build and benchmark similarly to AirIndex.
bash scripts/data_calculator_sosd.sh file:///path/to/data file:///path/to/keyset file:///path/to/data_calc build 1 ~/reload_nfs.sh nfs
bash scripts/data_calculator_sosd.sh file:///path/to/data file:///path/to/keyset file:///path/to/data_calc benchmark 40 ~/reload_nfs.sh nfs
To benchmark on a skewed workload (6.3), generate skewed key sets and change the key set path accordingly.