Automatic creation of tile size selection models using ML methods, inspired by Yuki et al.'s paper (reference 1 below) and adapted for the ROSE compiler. Training and test data sets are generated from PolyBench 3.1.
To automatically model the tile size selection problem, this project contains code that:
- Automatically extracts the loop features that are relevant for deciding an optimal tile size. Refer to the paper for these features.
- Performs an empirical search for an optimal tile size for each input file.
- Uses each optimal tile size and loop feature pair to train a number of ML models to predict an optimal tile size.
- Uses the trained models in a ROSE pass so that unseen programs can be tiled automatically.
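The empirical search and the pairing of features with optimal tile sizes listed above can be sketched in a few lines of Python. This is a minimal illustration, not the project's code: the column names (`benchmark`, `tile_size`, `runtime_s`, and the three feature columns) and all values are made up for the example, and the real `features.csv` and `runtimes.csv` layouts may differ.

```python
import csv
import io

# Hypothetical CSV layouts and made-up values, for illustration only.
FEATURES_CSV = """benchmark,reads,writes,loop_depth
gemm,3,1,3
jacobi-2d,5,1,2
"""
RUNTIMES_CSV = """benchmark,tile_size,runtime_s
gemm,16,0.92
gemm,32,0.71
gemm,64,0.80
jacobi-2d,16,0.40
jacobi-2d,32,0.44
"""


def best_tile_sizes(runtime_rows):
    """Return {benchmark: tile size with the lowest measured runtime}."""
    best = {}
    for row in runtime_rows:
        name = row["benchmark"]
        size = int(row["tile_size"])
        runtime = float(row["runtime_s"])
        if name not in best or runtime < best[name][1]:
            best[name] = (size, runtime)
    return {name: size for name, (size, _) in best.items()}


def training_pairs(feature_rows, labels):
    """Pair each benchmark's feature vector with its best tile size."""
    pairs = []
    for row in feature_rows:
        name = row["benchmark"]
        if name in labels:
            features = [float(v) for k, v in row.items() if k != "benchmark"]
            pairs.append((features, labels[name]))
    return pairs


features = list(csv.DictReader(io.StringIO(FEATURES_CSV)))
runtimes = list(csv.DictReader(io.StringIO(RUNTIMES_CSV)))
labels = best_tile_sizes(runtimes)
pairs = training_pairs(features, labels)
```

Each element of `pairs` is a (feature vector, tile size) tuple, which is the shape a classifier expects as training input and target.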
GenerateTiledBenchmarks.C:
A ROSE pass that, for each tile candidate loop, extracts features of the loop and outputs a program with that loop tiled at each of a range of tile sizes ({1, 4, 8, 16, 32, 64, 128, 256} by default). Loop features for each test case are appended as a row to the CSV file features.csv.
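For readers unfamiliar with the transformation, the sketch below shows what tiling a two-deep loop nest means. It is written in Python purely for illustration and is not the code the ROSE pass emits; the function names are hypothetical. Tiling changes only the order in which iterations are visited, never the set of iterations.

```python
def untiled(n):
    """Visit every (i, j) in an n x n iteration space in row-major order."""
    return [(i, j) for i in range(n) for j in range(n)]


def tiled(n, tile):
    """Visit the same iteration space one tile x tile block at a time."""
    order = []
    for ii in range(0, n, tile):          # loop over tile origins
        for jj in range(0, n, tile):
            # Loop inside one tile; min() clips partial tiles at the edges.
            for i in range(ii, min(ii + tile, n)):
                for j in range(jj, min(jj + tile, n)):
                    order.append((i, j))
    return order


# Default candidate sizes named in the description above.
DEFAULT_TILE_SIZES = [1, 4, 8, 16, 32, 64, 128, 256]
```

A tile size of 1, or any tile size at least as large as the loop bounds, reproduces the original iteration order, which is why the candidate range is only interesting in between.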
generate_all_tiled_benchmarks.sh:
A bash script that calls GenerateTiledBenchmarks.C on all benchmarks in the benchmarks/polybench-3.1 directory and stores each output in a directory named tiled_polybench/.
measure_runtimes.sh:
A bash script that measures the runtime of each tiled PolyBench program in tiled_polybench/. Stores each result as a row in the CSV file tiled_polybench/runtimes.csv.
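The measurement step above can be approximated as follows. The project does this in bash; the Python sketch here only illustrates the idea. The CSV column names, the `time_command` and `record_runtimes` helpers, and the choice to keep the fastest of several repeats are all assumptions, not details taken from measure_runtimes.sh.

```python
import csv
import subprocess
import time


def time_command(cmd, repeats=3):
    """Run cmd several times and return the fastest wall-clock time in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        best = min(best, time.perf_counter() - start)
    return best


def record_runtimes(commands, csv_path):
    """Write one (program, tile_size, runtime_s) row per timed command."""
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["program", "tile_size", "runtime_s"])
        for program, tile_size, cmd in commands:
            writer.writerow([program, tile_size, f"{time_command(cmd):.6f}"])
```

Here `commands` is a list of `(program, tile_size, argv)` tuples, where `argv` is the command line of one compiled, tiled benchmark binary.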
notebooks/tile_size_analysis.ipynb:
A Jupyter notebook that reads the tiled_polybench/runtimes.csv and tiled_polybench/features.csv files into dataframes, performs some feature processing, preps the data for training, and finally trains a number of scikit-learn classifiers to predict the empirically chosen optimal tile sizes. The trained models are saved into the models/ directory.
predict_tile_size.py:
A Python program that takes loop features as command-line arguments and performs inference with the trained models. Writes the prediction to a specified file.
AutoTile.C:
A ROSE pass that, for each tile candidate loop, extracts features of the loop, calls predict_tile_size.py with these features, and finally uses the predicted tile sizes to automatically tile the program.
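The hand-off described above, where a script receives features on the command line and writes a prediction to a file for the pass to read back, might look like the sketch below. The argument names (`features`, `--out`) and the `predict` rule are hypothetical stand-ins: the actual interface of predict_tile_size.py and the trained models it uses are not shown in this README.

```python
import argparse


def predict(features):
    """Hypothetical placeholder for the trained models, not a real predictor."""
    return 32 if sum(features) < 10 else 64


def main(argv=None):
    parser = argparse.ArgumentParser(
        description="Predict a tile size from loop features."
    )
    parser.add_argument("features", nargs="+", type=float,
                        help="loop feature values")
    parser.add_argument("--out", required=True,
                        help="file to write the prediction to")
    args = parser.parse_args(argv)
    # Writing to a file lets the calling pass read the result back
    # without having to parse the script's standard output.
    with open(args.out, "w") as f:
        f.write(f"{predict(args.features)}\n")
```

Calling `main(["3", "1", "3", "--out", "prediction.txt"])` writes a single line containing the predicted tile size to `prediction.txt`.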
1: Tomofumi Yuki, Lakshminarayanan Renganarayanan, Sanjay Rajopadhye, Charles Anderson, Alexandre E. Eichenberger, and Kevin O'Brien. 2010. Automatic creation of tile size selection models. In Proceedings of the 8th annual IEEE/ACM international symposium on Code generation and optimization (CGO '10). Association for Computing Machinery, New York, NY, USA, 190–199. DOI:https://doi.org/10.1145/1772954.1772982
2: Song Liu, Yuanzhen Cui, Qing Jiang, Qian Wang, Weiguo Wu, An efficient tile size selection model based on machine learning, Journal of Parallel and Distributed Computing, Volume 121, 2018, Pages 27-41, ISSN 0743-7315, https://doi.org/10.1016/j.jpdc.2018.06.005.