/Rocks-tracker

An example of a RocksDB monitor


ADOC-tracker

What's this?

This is the profiling framework used in the paper "ADOC: Automatically Harmonizing Dataflow Between Components in Log-Structured Key-Value Stores for Improved Performance".

What's it for?

You can use this framework to track the following real-time metrics:

| Metrics | Measure Tool | Content | Result File |
| --- | --- | --- | --- |
| IO information | iostat | CPU utilization (IO process only) and bandwidth for each second | text, named with IOSTAT.txt |
| process information | pidstat | CPU utilization (sys, usr, total), disk info (bytes read/written in) | stat_result.csv |
| real-time throughput | db_bench | time elapsed and throughput | report.csv |
| output of db_bench | N/A | workload information, performance summary, level states, etc. | stdout.txt |
| *perf output | perf | execution trace | perf.out |

* The usage of perf is shown in the example directory, but you need to configure it yourself. It is not embedded automatically, for both performance and output-size reasons.
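As a minimal sketch of how the report.csv file listed above could be post-processed, the snippet below computes the covered time, the mean throughput, and the peak throughput. It assumes the file has a header row with the columns secs_elapsed and interval_qps; that is an assumption about the report layout, so check the header of your own file first. The function name summarize_report and the sample rows are hypothetical.

```python
import csv
import io


def summarize_report(lines):
    """Return (seconds_covered, mean_qps, peak_qps) from report rows.

    `lines` is any iterable of CSV text lines (an open file works).
    Assumed columns: secs_elapsed, interval_qps.
    """
    elapsed, qps = [], []
    for row in csv.DictReader(lines):
        elapsed.append(int(row["secs_elapsed"]))
        qps.append(float(row["interval_qps"]))
    if not qps:
        return 0, 0.0, 0.0
    return elapsed[-1], sum(qps) / len(qps), max(qps)


# Hypothetical sample data standing in for a real report.csv.
sample = io.StringIO("secs_elapsed,interval_qps\n1,1000\n2,1500\n3,500\n")
seconds, mean_qps, peak_qps = summarize_report(sample)
print(seconds, mean_qps, peak_qps)
```

For a real run, pass an open file instead of the in-memory sample, for example `with open("report.csv", newline="") as f: summarize_report(f)`.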

There are also some other features you can use, for example:

  1. The cgroup tool is embedded in the framework, so you can
    1. limit the CPU clock number to control the vCPUs used by db_bench (set in default.ini)
    2. limit the bandwidth through bytes_wrote_in; see the "bandwidth_influence" example
    3. make further modifications to take full advantage of the cgroup tool
  2. You can use db_bench_dynamic_runner to simulate scenarios such as:
    1. Your db_bench runs alongside other software with higher throughput (the throughput is generated from Alibaba's workload trace, using only the first hour of machine No. 48)
  3. If you are interested in the impact of each parameter, try the parameter_influence example. We will upload the ANOVA test script later, so that you can use one-way ANOVA to analyze the impact of different parameters and pick the most important ones. This function is inspired by the Rafiki paper.
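Until the ANOVA script is available, the idea behind the one-way ANOVA mentioned above can be sketched in plain Python. This is not the authors' script; it is a generic textbook computation of the F statistic, and the function name and the throughput numbers are hypothetical. Each group holds the measured throughput of repeated runs under one value of a parameter (for example, one thread count). A larger F suggests the parameter has a stronger influence on throughput.

```python
def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of sample groups.

    Requires at least two groups, more samples than groups, and some
    variation inside the groups (otherwise the division is undefined).
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Variation explained by the parameter (between groups).
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Residual variation (within groups).
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Hypothetical throughput samples (arbitrary units) for three thread counts.
throughput_by_threads = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
print(round(one_way_anova_f(throughput_by_threads), 6))  # 13.0
```

Running the same function once per parameter and ranking the resulting F values gives a rough ordering of which parameters matter most. A significance test would additionally need the F distribution, which is left out here.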

Warning!!!

The result files can be very large. Use the commands sudo gzip **/LOG* and sudo gzip **/iostat* to compress the oversized files.
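If you prefer to do the compression from a Python script, the sketch below does roughly the same job. The function name compress_matching is hypothetical, and unlike the shell glob (whose ** only recurses when globstar is enabled) it searches the given root directory at every depth. Matching is case-sensitive on Linux, so adjust the patterns to the actual file names.

```python
import gzip
import shutil
from pathlib import Path


def compress_matching(root, patterns=("LOG*", "iostat*")):
    """Gzip every matching file under `root` and delete the original.

    Returns the list of created .gz paths.
    """
    compressed = []
    for pattern in patterns:
        # sorted() materializes the matches before any new files appear.
        for path in sorted(Path(root).rglob(pattern)):
            if not path.is_file() or path.suffix == ".gz":
                continue  # skip directories and already-compressed files
            target = path.with_name(path.name + ".gz")
            with open(path, "rb") as src, gzip.open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
            path.unlink()
            compressed.append(target)
    return compressed
```

Call it with the directory that holds your results, for example `compress_matching("results")`, where "results" is a placeholder. Run it with sufficient permissions if the files were written by a process started with sudo.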

Preparation

  1. Download RocksDB and compile db_bench
    1. Modify default.ini and set the db_bench path
    2. You can always reload the path within the running script
  2. This framework was designed for evaluating the impact of thread number and batch size (the common size of the Memtable and SSTable), but you can always change the configuration in config.json
  3. You will need several Python packages and the following system tools:
    1. iostat
    2. pidstat
    3. top
    4. perf
    5. cgroup
  4. Please download the flame graph tool from this link if you want to plot flame graphs
  5. If you are interested, you can find the plot script at this link

After you have installed all the packages, create a directory and create a DB_launcher class to run your experiments. Refer to the following examples for further details.
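The db_bench path step above (set it in default.ini, optionally replace it in the running script) can be illustrated with the standard configparser module. This is only a sketch: the section name db_bench, the key path, and the function load_db_bench_path are assumptions made for illustration, so use the names that actually appear in the repository's default.ini.

```python
import configparser


def load_db_bench_path(ini_text, override=None):
    """Return the db_bench path from INI text, unless an override is given."""
    if override:
        return override  # path supplied by the running script wins
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    # Hypothetical section and key names.
    return parser.get("db_bench", "path")


# Hypothetical file content standing in for default.ini.
example_ini = "[db_bench]\npath = /home/user/rocksdb/db_bench\n"
print(load_db_bench_path(example_ini))
print(load_db_bench_path(example_ini, override="/opt/rocksdb/db_bench"))
```

Letting the script-level override take precedence keeps default.ini as a machine-wide default while individual experiments can still point at a different build.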

What's in the dirs?

| dir name | usage |
| --- | --- |
| bandwidth_influence | use cgroup to limit the available bandwidth |
| parameter_influence | traverse through all options, and use the ANOVA method to evaluate the impact of each parameter |
| rate-limited-fillrandom | run the fillrandom workload with a rate-limiter in db_bench |
| fillrandom | the basic usage, run fillrandom and monitor the resource usage |
| white_noise_fillrandom | run fillrandom with varying bandwidth, the bandwidth follows a sine function |
| on_cpu_analysis | run fillrandom and save the perf results |
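To make the sine-shaped bandwidth of white_noise_fillrandom concrete, here is a minimal sketch of such a schedule. It is not taken from the example directory: the function name sine_bandwidth and the base, amplitude, and period values are hypothetical, and how the resulting limit is applied (for example through cgroup) is left out.

```python
import math


def sine_bandwidth(t, base, amplitude, period):
    """Bandwidth limit in bytes per second at time `t` (seconds).

    The limit oscillates around `base` by +/- `amplitude` once every
    `period` seconds and is clamped so it never goes below zero.
    """
    value = base + amplitude * math.sin(2 * math.pi * t / period)
    return max(0, int(round(value)))


MB = 1024 * 1024
# Hypothetical schedule: 200 MB/s base, +/- 100 MB/s swing, 60 s period.
for second in (0, 15, 30, 45):
    print(second, sine_bandwidth(second, 200 * MB, 100 * MB, 60) // MB)
```

With these placeholder values the limit starts at the base, peaks a quarter period in, returns to the base at the half period, and reaches its minimum at three quarters of the period.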