This repository holds two easy-to-use scripts that convert request traces from different formats into one consistent format, which includes an access time for each request in the original trace.
These traces are taken from wikibench.
- gradle - the key is a hex number in the first field of each line; fields are separated by spaces
- address - the address is a hex number in the second field of each line; fields are separated by spaces
- arc - the address is a number in the first field and the size in bytes is the second field; fields are separated by spaces. The two fields are used to create blocks starting at different addresses. This is the format used by the traces in the ARC paper
- oltp - the key is a number in the first field of each line; fields are separated by commas
- lirs - the key is a number and is the only data on the line
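The field layouts above can be sketched as a small parser. This is only an illustration, not code from the scripts in this repository: the function name `parse_line` is hypothetical, and plain "number" fields are assumed to be decimal. For `arc` it returns the raw address and size and leaves the block expansion to the caller.

```python
def parse_line(fmt, line):
    """Return the key for one trace line, or (address, size) for 'arc'.

    Hypothetical helper; 'number' fields are assumed to be decimal.
    """
    line = line.strip()
    if fmt == "gradle":
        # Hex key in the first space-separated field.
        return int(line.split()[0], 16)
    if fmt == "address":
        # Hex address in the second space-separated field.
        return int(line.split()[1], 16)
    if fmt == "arc":
        # Address in the first field, size in bytes in the second.
        fields = line.split()
        return int(fields[0]), int(fields[1])
    if fmt == "oltp":
        # Key in the first comma-separated field.
        return int(line.split(",")[0])
    if fmt == "lirs":
        # The key is the whole line.
        return int(line)
    raise ValueError(f"unknown trace format: {fmt}")
```

For example, `parse_line("oltp", "42,rest,of,line")` gives `42`.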
This is a partial list; more traces can be found online in public trace repositories:
- gradle - available at caffeine repo
- address - available at caffeine repo
- oltp - available at UMassTraceRepository
- lirs - available at caffeine repo
Before running the script, place the input traces from wikibench in the input directory.
Then update the wiki-trace-maker.py file with the parameters you want.
TODO: add command line arguments to control parameters
The required changes are in the following lines:
biases = [[0.8,0.15,0.05]]
ranges = [[(10,31),(120,181),(350,451)]]
for wikifile in ['wiki1190207720']:
...
- In the wiki files list, enter the names of the files you downloaded and placed in the input directory.
- In the biases list, put the ratio of requests for each range of times; each ratio applies to the range at the same index in the ranges list.
For example in the code above:
- 80% of the requests will be fetched in times in the range 10ms - 30ms
- 15% of the requests will be fetched in times in the range 120ms - 180ms
- 5% of the requests will be fetched in times in the range 350ms - 450ms
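The way a bias list and a range list combine can be sketched as follows. This is a minimal illustration, not the code in wiki-trace-maker.py: the function name `sample_access_time` is hypothetical, a single flat set of biases and ranges is used, and times are assumed to be whole milliseconds drawn uniformly with an exclusive upper bound (so `(10, 31)` yields 10 to 30).

```python
import random

# One set of biases and ranges, matching the example above.
biases = [0.8, 0.15, 0.05]
ranges = [(10, 31), (120, 181), (350, 451)]


def sample_access_time(rng, biases, ranges):
    """Pick a range according to its bias, then a time inside that range."""
    # Weighted choice of one (low, high) pair.
    low, high = rng.choices(ranges, weights=biases, k=1)[0]
    # Uniform integer in [low, high); uniformity is an assumption.
    return rng.randrange(low, high)


rng = random.Random(0)  # fixed seed so runs are repeatable
times = [sample_access_time(rng, biases, ranges) for _ in range(10000)]
```

With enough samples, roughly 80% of the values in `times` fall between 10 and 30.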
python3 wiki-trace-maker.py
Run:
python3 paper-storage-script.py
to generate a GCC example trace with access times based on SSD and HDD.
You can easily change the input files, formats, and time distributions in the file paper-storage-script.py.
To add new drive types, edit the dictionary at the top of the address-trace-maker.py file.
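As a rough idea of what adding a drive type could look like, here is a sketch that assumes the dictionary maps a drive name to a range of access times in milliseconds. The name `DRIVES`, the layout, and all numbers are placeholders for illustration; check the actual dictionary in address-trace-maker.py before editing it.

```python
import random

# Hypothetical layout: drive name -> (min_ms, max_ms), upper bound exclusive.
# Placeholder numbers, not measurements and not the script's real values.
DRIVES = {
    "SSD": (1, 3),
    "HDD": (10, 21),
}

# Under this layout, a new drive type is one new entry.
DRIVES["NVME"] = (1, 2)


def access_time(drive, rng=random):
    """Draw an access time in ms for the given drive type."""
    low, high = DRIVES[drive]
    return rng.randrange(low, high)
```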