We adopt a learning-based approach to cache replacement, aiming to reduce the miss rate relative to existing traditional cache replacement policies. The main modeling idea is to cast the replacement strategy as an MDP so that we can employ DRL to learn how to make eviction decisions. We refer to Zhong et al. and design a similar MDP model; the learning backbone, however, is a value-based DQN. Our main effort is to use a short-term reward to optimize the long-term miss rate, and to further reduce the model size by clustering the cache pages and using cluster features, which makes our work practical and compatible with real-world systems.
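As a rough illustration of how a short-term reward signal can still optimize the long-term miss rate, the snippet below computes a standard DQN temporal-difference target. The reward scheme and Q-values are made-up placeholders for illustration, not the exact ones used in this codebase.

```python
import numpy as np

def td_target(reward, next_q, gamma=0.9):
    """Standard DQN target: the immediate (short-term) reward plus the
    discounted value of the best action in the next state."""
    return reward + gamma * np.max(next_q)

# Toy example: assume the cache reports a hit (reward 1.0) after an eviction,
# and the Q-network predicts these values for evicting each candidate page in
# the next state (all numbers are placeholders).
print(td_target(reward=1.0, next_q=np.array([0.3, 0.8, 0.5])))  # 1.72
```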
To run the code for the experiments, you should have at least the following dependency packages installed. We have tested our code with Python 3.6 on macOS and Ubuntu 16.04.
- numpy
- scipy
- pandas
- tensorflow 1.0
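For example, on a fresh environment the dependencies can be installed with pip; the exact version pins below are only a suggestion (any TensorFlow 1.x release should satisfy the requirement above):

```
pip install numpy scipy pandas "tensorflow>=1.0,<2.0"
```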
The modules are organized into two directories, `agents` and `cache`. The `agents` folder contains our implementations of the DRL agent and the reflex agents, while the `cache` folder contains a cache simulator and its affiliated data loader.
- `CacheAgent.py` contains a series of base classes for cache replacement agents.
- `ReflexAgent.py` contains our implementations of cache agents with hand-crafted replacement policies, e.g. LRU, Random, etc.
- `DQNAgent.py` contains the class `DQNAgent`, a cache agent with a DRL-based replacement strategy. `DQNAgent` is based on Deep Q-Network, and we employ `tensorflow` to build the MLPs.
- `Cache.py` contains a simulated cache system, acting as the environment for every agent. It not only maintains cache states, but also receives actions from agents and gives feedback (a toy sketch of this agent/environment interplay follows this list). Hence, it accepts multiple sets of parameters, not only to set up the cache system itself, but also to specify the observation features and reward functions.
- `DataLoader.py` contains the subclasses `DataLoaderPintos`, `DataLoaderZipf`, and `DataLoaderMix`. `DataLoaderPintos` can load data from our collected or synthetic datasets saved in `.csv` format (refer to our dataset for details). `DataLoaderZipf` can generate access records by mimicking disk activities with a Zipf distribution. `DataLoaderMix` can generate mixed access requests by periodically fetching requests (e.g., 10 requests of each) from two trace files.
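To make the agent/environment split concrete, here is a self-contained toy sketch of that interplay: a simulated cache hands the agent its state, the agent picks an eviction victim, and the cache answers with a short-term reward. All class and method names below are illustrative stand-ins and do not match the actual interfaces in `Cache.py`, `CacheAgent.py`, or `ReflexAgent.py`.

```python
import random

class ToyCache:
    """Minimal stand-in for a cache environment (not the repo's Cache class)."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.slots = []          # cached page ids
        self.last_used = {}      # page id -> last access step

    def access(self, page, step):
        if page in self.slots:
            self.last_used[page] = step
            return 1.0           # hit: positive short-term reward
        if len(self.slots) < self.capacity:
            self.slots.append(page)
            self.last_used[page] = step
            return 0.0
        return None              # cache full: the agent must pick a victim

    def evict_and_insert(self, victim_idx, page, step):
        victim = self.slots[victim_idx]
        del self.last_used[victim]
        self.slots[victim_idx] = page
        self.last_used[page] = step
        return 0.0               # miss: no reward on this step

class ToyLRUAgent:
    """Hand-crafted reflex policy: evict the least recently used slot."""
    def choose_victim(self, cache):
        return min(range(len(cache.slots)),
                   key=lambda i: cache.last_used[cache.slots[i]])

cache, agent = ToyCache(capacity=3), ToyLRUAgent()
trace = [random.randint(0, 5) for _ in range(100)]
hits = 0.0
for step, page in enumerate(trace):
    reward = cache.access(page, step)
    if reward is None:                          # cache full -> ask the agent
        reward = cache.evict_and_insert(agent.choose_victim(cache), page, step)
    hits += reward
print("toy hit rate:", hits / len(trace))
```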
- `training_data` contains traces which can be used to train the model. It consists of two trace patterns: one is Zipf-like (e.g., 80% of requests fall into 20% of the content), and the other is a pure sequential scan (given a disk address range, randomly pick a starting point, then randomly pick a scan length, and repeat); a rough sketch of both generators is given after this list. Traces of the second pattern contain `_seq` in their filenames.
- `testing_data` contains traces which can be used to test our trained model.
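The two trace patterns can be mimicked roughly as follows. The address range, Zipf exponent, and maximum scan length here are illustrative guesses, not the parameters used to produce the shipped traces; use `DataLoaderZipf` for the actual Zipf generator.

```python
import numpy as np

def zipf_trace(n_requests, n_pages, a=1.2):
    """Skewed accesses: a small fraction of pages receives most requests
    (the '80% of requests fall into 20% of content' pattern)."""
    ranks = np.random.zipf(a, size=n_requests)
    return np.minimum(ranks, n_pages) - 1

def sequential_scan_trace(n_requests, n_pages, max_scan=100):
    """Pure sequential scans: pick a random start, scan a random length,
    then repeat (the *_seq pattern)."""
    trace = []
    while len(trace) < n_requests:
        start = np.random.randint(0, n_pages)
        length = np.random.randint(1, max_scan + 1)
        trace.extend(range(start, min(start + length, n_pages)))
    return np.array(trace[:n_requests])

print(zipf_trace(10, 1000))
print(sequential_scan_trace(10, 1000))
```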
- To train/test a pure Zipf trace:
  ```
  ./run_training_pure.py training_data/zipf_15k_40k.data
  ./run_inference_pure.py testing_data/zipf_15k_40k.data
  ```
- To train/test a pure sequential-scan trace:
  ```
  ./run_training_pure.py training_data/zipf_15k_40k_seq.data
  ./run_inference_pure.py testing_data/zipf_15k_40k_seq.data
  ```
- To train/test a mix of the above traces:
  ```
  ./run_training_mixed.py training_data/zipf_15k_40k.data training_data/zipf_15k_40k_seq.data
  ./run_inference_mixed.py testing_data/zipf_15k_40k.data testing_data/zipf_15k_40k_seq.data
  ```
- To run classic algorithms over the traces:
  ```
  ./run_classic_algos.py testing_data/zipf_15k_40k.data
  ./run_classic_algos.py testing_data/zipf_15k_40k_seq.data
  ./run_classic_algos.py testing_data/zipf_15k_40k.data testing_data/zipf_15k_40k_seq.data
  ```
Note that you can modify the cache size in the above Python scripts.
Part of the codebase is based on the open-source code from Peihao Wang.