A list of research papers, datasets, conferences, talks, and projects on compilers and program optimization.
- BaCO: A Fast and Portable Bayesian Compiler Optimization Framework - Erik Hellsten, Artur Souza, Johannes Lenfers, Rubens Lacouture, Olivia Hsu, Adel Ejjeh, Fredrik Kjolstad, Michel Steuwer, Kunle Olukotun, Luigi Nardi. ASPLOS 2024.
- Large Language Models for Compiler Optimization - Chris Cummins, Volker Seeker, Dejan Grubisic, Mostafa Elhoushi, Youwei Liang, Baptiste Roziere, Jonas Gehring, Fabian Gloeckle, Kim Hazelwood, Gabriel Synnaeve, Hugh Leather. CC 2023.
- (De/Re)-Compositions Expressed Systematically via MDH-Based Schedules - Ari Rasch, Richard Schulze, Denys Shabalin, Anne Elster, Sergei Gorlatch, Mary Hall. CC 2023.
- Learning Compiler Pass Orders using Coreset and Normalized Value Prediction - Youwei Liang, Kevin Stone, Ali Shameli, Chris Cummins, Mostafa Elhoushi, Jiadong Guo, Benoit Steiner, Xiaomeng Yang, Pengtao Xie, Hugh Leather, Yuandong Tian. ICML 2023.
- TLP: A Deep Learning-Based Cost Model for Tensor Program Tuning - Yi Zhai, Yu Zhang, Shuo Liu, Xiaomeng Chu, Jie Peng, Jianmin Ji, Yanyong Zhang. ASPLOS 2023.
- Performance Embeddings: A Similarity-Based Transfer Tuning Approach to Performance Optimization - L Trümper, T Ben-Nun, P Schaad, A Calotoiu, T Hoefler. ICS 2023.
- A Game-Based Framework to Compare Program Classifiers and Evaders - Thais Damasio, Michael Canesche, Vinicius Pacheco, Anderson Faustino da Silva, Marcus Botacin and Fernando Magno Quintao Pereira. CGO 2023. Code and Data
- SRTuner: Effective Compiler Optimization Customization by Exposing Synergistic Relations - Sunghyun Park, Salar Latifi, Yongjun Park, Armand Behroozi, Byungsoo Jeon, Scott Mahlke. CGO 2022.
- Iterative Compilation Optimization Based on Metric Learning and Collaborative Filtering - Hongzhi Liu, Jie Luo, Ying Li, Zhonghai Wu. ACM TACO 2022.
- Discovering faster matrix multiplication algorithms with reinforcement learning - Alhussein Fawzi, Matej Balog, Aja Huang, Thomas Hubert, Bernardino Romera-Paredes, Mohammadamin Barekatain, Alexander Novikov et al. Nature 2022.
- Autotuning Convolutions is Easier Than You Think - Nicolas Tollenaere, Guillaume Iooss, Stéphane Pouget, Hugo Brunie, Christophe Guillon, Albert Cohen, P. Sadayappan, Fabrice Rastello. ACM TACO 2022.
- Transfer-Tuning: Reusing Auto-Schedules for Efficient Tensor Program Code Generation - Perry Gibson, Jose Cano. PACT 2022.
- Glimpse: Mathematical Embedding of Hardware Specification for Neural Compilation - Byung Hoon Ahn, Sean Kinzer, Hadi Esmaeilzadeh. DAC 2022.
- One-shot tuner for deep learning compilers - Jaehun Ryu, Eunhyeok Park, Hyojin Sung. CC 2022.
- Performance-Detective: Automatic Deduction of Cheap and Accurate Performance Models - Larissa Schmid, Marcin Copik, Alexandru Calotoiu, Dominik Werle, Andreas Reiter, Michael Selzer, Anne Koziolek, Torsten Hoefler. ICS 2022.
- Program Representations for Predictive Compilation: State of Affairs in the Early 20's - Anderson Faustino da Silva, Edson Borin, Fernando Magno Quintao Pereira, Nilton Luiz Queiroz Junior and Otavio Oliveira Napoli. JCL 2022. Code and Data
- Improving cross-platform binary analysis using representation learning via graph alignment - Geunwoo Kim, Sanghyun Hong, Michael Franz, Dokyung Song. ISSTA 2022.
- BenchPress: A Deep Active Benchmark Generator - Foivos Tsimpourlas, Pavlos Petoumenos, Min Xu, Chris Cummins, Kim Hazelwood, Ajitha Rajan, Hugh Leather. PACT 2022 (code)
- Automating Reinforcement Learning Architecture Design for Code Optimization - Huanting Wang, Zhanyong Tang, Cheng Zhang, Jiaqi Zhao, Chris Cummins, Hugh Leather, Zheng Wang. CC 2022 (code)
- Composable and Modular Code Generation in MLIR: A Structured and Retargetable Approach to Tensor Compiler Construction - Nicolas Vasilache, Oleksandr Zinenko, Aart J.C. Bik, Mahesh Ravishankar, Thomas Raoux, Alexander Belyaev, Matthias Springer, Tobias Gysi, Diego Caballero, Stephan Herhut, Stella Laurenzo, Albert Cohen. arXiv 2022.
- The Deep Learning Compiler: A Comprehensive Survey - Mingzhen Li, Yi Liu, Xiaoyan Liu, Qingxiao Sun, Xin You, Hailong Yang, Zhongzhi Luan, Lin Gan, Guangwen Yang, Depei Qian, IEEE Transactions on Parallel and Distributed Systems, 2021
- Bayesian Optimization is Superior to Random Search for Machine Learning Hyperparameter Tuning: Analysis of the Black-Box Optimization Challenge 2020 - Ryan Turner, David Eriksson, Michael McCourt, Juha Kiili, Eero Laaksonen, Zhen Xu, Isabelle Guyon. arXiv 2021.
- Bliss: auto-tuning complex applications using a pool of diverse lightweight learning models - RB Roy, T Patel, V Gadepally, D Tiwari. PLDI 2021.
- Efficient Compiler Autotuning via Bayesian Optimization - Junjie Chen, Ningxin Xu, Peiqi Chen, Hongyu Zhang. ICSE 2021.
- Customized Monte Carlo Tree Search for LLVM/Polly's Composable Loop Optimization Transformations - Jaehoon Koo, Prasanna Balaprakash, Michael Kruse, Xingfu Wu, Paul Hovland, Mary Hall. arXiv 2021.
- A Reinforcement Learning Environment for Polyhedral Optimizations - Alexander Brauckmann, Andrés Goens, Jeronimo Castrillon. PACT 2021.
- AI Powered Compiler Techniques for DL Code Optimization - Sanket Tavarageri, Gagandeep Goyal, Sasikanth Avancha, Bharat Kaul, Ramakrishna Upadrasta. arXiv 2021.
- VeGen: A Vectorizer Generator for SIMD and Beyond - Yishen Chen, Charith Mendis, Michael Carbin, Saman Amarasinghe. ASPLOS 2021.
- A Flexible Approach to Autotuning Multi-Pass Machine Learning Compilers - Phitchaya Mangpo Phothilimthana, Amit Sabne, Nikhil Sarda, Karthik Srinivasa Murthy, Yanqi Zhou, Christof Angermueller, Mike Burrows, Sudip Roy, Ketan Mandke, Rezsa Farahani, Yu Emma Wang, Berkin Ilbeyi, Blake Hechtman, Bjarke Roune, Shen Wang, Yuanzhong Xu, and Samuel J. Kaufman. PACT 2021.
- Value Learning for Throughput Optimization of Deep Neural Workloads - Benoit Steiner, Chris Cummins, Horace He, Hugh Leather. MLSys 2021.
- DynaTune: Dynamic Tensor Program Optimization in Deep Neural Network Compilation - Minjia Zhang, Menghao Li, Chi Wang, Mingqin Li. ICLR 2021.
- Optimizing Memory Placement using Evolutionary Graph Reinforcement Learning - Shauharda Khadka, Estelle Aflalo, Mattias Mardar, Avrech Ben-David, Santiago Miret, Shie Mannor, Tamir Hazan, Hanlin Tang, Somdeb Majumdar. ICLR 2021.
- GPTune: Multitask Learning for Autotuning Exascale Applications - Yang Liu, Wissam M. Sid-Lakhdar, Osni Marques, Xinran Zhu, Chang Meng, James W. Demmel, Xiaoye S. Li. PPoPP 2021.
- ApproxTuner: A Compiler and Runtime System for Adaptive Approximations - Hashim Sharif, Yifan Zhao, Maria Kotsifakou, Akash Kothari, Ben Schreiber, Elizabeth Wang, Yasmin Sarita, Nathan Zhao, Keyur Joshi, Vikram S. Adve, Sasa Misailovic, Sarita Adve. PPoPP 2021.
- Efficient Auto-Tuning of Parallel Programs with Interdependent Tuning Parameters via Auto-Tuning Framework (ATF) - Ari Rasch, Richard Schulze, Michel Steuwer, Sergei Gorlatch. ACM TACO 2021.
- Exploring the space of optimization sequences for code-size reduction: insights and tools - Anderson Faustino da Silva, Bernardo N. B. de Lima, and Fernando Magno Quintao Pereira. CC 2021. Code and Data
- Using machine learning to predict the code size impact of duplication heuristics in a dynamic compiler - Raphael Mosaner, David Leopoldseder, Lukas Stadler, and Hanspeter Mössenböck. MPLR 2021.
- MLGO: a Machine Learning Guided Compiler Optimizations Framework - Mircea Trofin, Yundi Qian, Eugene Brevdo, Zinan Lin, Krzysztof Choromanski, David Li. arXiv. Code
- ANGHABENCH: a Suite with One Million Compilable C Benchmarks for Code-Size Reduction - Anderson Faustino da Silva, Bruno Conde Kind, Jose Wesley de Souza Magalhaes, Jeronimo Nunes Rocha, Breno Campos Ferreira Guimaraes, Fernando Magno Quintao Pereira. CGO 2021. Code and Data
- Neural Network-based Performance Prediction for Task Migration on S-NUCA Many-Cores - Martin Rapp, Anuj Pathania, Tulika Mitra, Jörg Henkel, IEEE Transactions on Computers, 2021.
- A Deep Learning Based Cost Model for Automatic Code Optimization - Riyadh Baghdadi, Massinissa Merouani, Mohamed-Hicham Leghettas, Kamel Abdous, Taha Arbaoui, Karima Benatchba, Saman Amarasinghe. MLSys 2021.
- Comparative Code Structure Analysis using Deep Learning for Performance Prediction - Nathan Pinnow, Tarek Ramadan, Tanzima Z. Islam, Chase Phelps, Jayaraman J. Thiagarajan. ISPASS 2021.
- Extracting Clean Performance Models from Tainted Programs - Marcin Copik, Alexandru Calotoiu, Tobias Grosser, Nicolas Wicki, Felix Wolf, Torsten Hoefler. PPoPP 2021.
- GraphCodeBERT: Pre-training Code Representations with Data Flow - Daya Guo, Shuo Ren, Shuai Lu, Zhangyin Feng, Duyu Tang, Shujie Liu, Long Zhou, Nan Duan, Alexey Svyatkovskiy, Shengyu Fu, Michele Tufano, Shao Kun Deng, Colin Clement, Dawn Drain, Neel Sundaresan, Jian Yin, Daxin Jiang, Ming Zhou. ICLR 2021.
- VESPA: static profiling for binary optimization - Angelica Aparecida Moreira, Guilherme Ottoni, and Fernando Magno Quintao Pereira. OOPSLA 2021. Code and Data
- Mapping Computations in Heterogeneous Multicore Systems with Statistical Regression on Program Inputs - Junio Cezar Ribeiro Da Silva, Lorena Leao, Vinicius Petrucci, Abdoulaye Gamatie and Fernando Magno Quintao Pereira. TECS 2021.
- Learning Semantic Representations to Verify Hardware Designs - Shobha Vasudevan, Wenjie (Joe) Jiang, David Bieber, Rishabh Singh, Hamid Shojaei, C. Richard Ho, Charles Sutton. NeurIPS 2021.
- Deep NLP-based co-evolvement for synthesizing code analysis from natural language - Zifan Nan, Hui Guan, Xipeng Shen, Chunhua Liao. CC 2021.
- A Taxonomy of ML for Systems Problems - Martin Maas, IEEE Micro, 2020
- Improved basic block reordering - Andy Newell and Sergey Pupyrev. IEEE Transactions on Computers, 2020.
- Static Neural Compiler Optimization via Deep Reinforcement Learning - Rahim Mammadli, Ali Jannesari, Felix Wolf. LLVM HPC Workshop, 2020.
- Autotuning Search Space for Loop Transformations - Michael Kruse, Hal Finkel, Xingfu Wu. LLVM HPC Workshop, 2020.
- A Collaborative Filtering Approach for the Automatic Tuning of Compiler Optimisations - Stefano Cereda, Gianluca Palermo, Paolo Cremonesi, and Stefano Doni, LCTES 2020.
- AutoPhase: Compiler Phase-Ordering for HLS with Deep Reinforcement Learning - Ameer Haj-Ali, Qijing Huang, William Moses, John Xiang, Ion Stoica, Krste Asanovic, John Wawrzynek. MLSys 2020.
- Deep Learning-based Hybrid Graph-Coloring Algorithm for Register Allocation - Dibyendu Das, Shahid Asghar Ahmad, Kumar Venkataramanan. LLVM HPC Workshop, 2020.
- NeuroVectorizer: end-to-end vectorization with deep reinforcement learning - Ameer Haj-Ali, Nesreen K. Ahmed, Ted Willke, Yakun Sophia Shao, Krste Asanovic, and Ion Stoica. CGO 2020.
- Chameleon: Adaptive Code Optimization for Expedited Deep Neural Network Compilation - Byung Hoon Ahn, Prannoy Pilligundla, Amir Yazdanbakhsh, Hadi Esmaeilzadeh. ICLR 2020.
- Ansor: Generating High-Performance Tensor Programs for Deep Learning - Lianmin Zheng, Chengfan Jia, Minmin Sun, Zhao Wu, Cody Hao Yu, Ameer Haj-Ali, Yida Wang, Jun Yang, Danyang Zhuo, Koushik Sen, Joseph E. Gonzalez, Ion Stoica. OSDI 2020. (slides, presentation)
- Achieving High-performance the Functional Way: a Functional Pearl on Expressing High-performance Optimizations as Rewrite Strategies - Bastian Hagedorn, Johannes Lenfers, Thomas Kœhler, Xueying Qin, Sergei Gorlatch, and Michel Steuwer. Proceedings of the ACM on Programming Languages 2020.
- PMEvo: Portable Inference of Port Mappings for Out-of-Order Processors by Evolutionary Optimization - Fabian Ritter, Sebastian Hack. PLDI 2020.
- An Active Learning Method for Empirical Modeling in Performance Tuning - Jiepeng Zhang, Jingwei Sun, Wenju Zhou, Guangzhong Sun. IPDPS 2020.
- CodeBERT: A Pre-Trained Model for Programming and Natural Languages - Zhangyin Feng, Daya Guo, Duyu Tang, Nan Duan, Xiaocheng Feng, Ming Gong, Linjun Shou, Bing Qin, Ting Liu, Daxin Jiang, Ming Zhou. EMNLP 2020.
- IR2VEC: LLVM IR Based Scalable Program Embeddings - S. VenkataKeerthy, Rohit Aggarwal, Shalini Jain, Maunendra Sankar Desarkar, Ramakrishna Upadrasta and Y. N. Srikant. TACO 2020.
- Deep Program Structure Modeling Through Multi-Relational Graph-based Learning - Guixin Ye, Zhanyong Tang, Huanting Wang, Jianbin Fang, Songfang Huang and Zheng Wang. PACT 2020.
- Global Relational Models of Source Code - Vincent J. Hellendoorn, Charles Sutton, Rishabh Singh, Petros Maniatis, David Bieber, ICLR 2020. (Data and Code)
- Learning Semantic Program Embeddings with Graph Interval Neural Network - Yu Wang, Ke Wang, Fengjuan Gao, and Linzhang Wang. OOPSLA 2020.
- Flow2Vec: Value-Flow-Based Precise Code Embedding - Yulei Sui, Xiao Cheng, Guanqin Zhang and Haoyu Wang. OOPSLA 2020.
- MISIM: An End-to-End Neural Code Similarity System - Fangke Ye, Shengtian Zhou, Anand Venkat, Ryan Marcus, Nesime Tatbul, Jesmin Jahan Tithi, Paul Petersen, Timothy Mattson, Tim Kraska, Pradeep Dubey, Vivek Sarkar and Justin Gottschlich. arXiv 2020.
- Blended, precise semantic program embeddings - Ke Wang and Zhendong Su. PLDI 2020.
- LambdaNet: Probabilistic Type Inference using Graph Neural Networks - Jiayi Wei, Maruth Goyal, Greg Durrett, and Isil Dillig. ICLR 2020.
- Compiler-based graph representations for deep learning models of code - Alexander Brauckmann, Andrés Goens, Sebastian Ertel, and Jeronimo Castrillon. CC 2020.
- FuncyTuner: Auto-tuning Scientific Applications With Per-loop Compilation - Tao Wang, Nikhil Jain, David Beckingsale, David Böhme, Frank Mueller, Todd Gamblin. ICPP 2019.
- Unleashing the Power of Learning: An Enhanced Learning-Based Approach for Dynamic Binary Translation - Changheng Song, Wenwen Wang, Pen-Chung Yew, Antonia Zhai, Weihua Zhang. USENIX ATC 2019.
- Compiler Auto-Vectorization with Imitation Learning - Charith Mendis, Cambridge Yang, Yewen Pu, Saman P. Amarasinghe, Michael Carbin. NeurIPS 2019.
- Multi-objective Exploration for Practical Optimization Decisions in Binary Translation - Sunghyun Park, Youfeng Wu, Janghaeng Lee, Amir Aupov, and Scott Mahlke. ACM Transactions on Embedded Computing Systems (TECS), 2019.
- A Pattern Based Algorithmic Autotuner for Graph Processing on GPUs - Ke Meng, Jiajia Li, Guangming Tan, Ninghui Sun. PPoPP 2019.
- FBNet: Hardware-Aware Efficient ConvNet Design via Differentiable Neural Architecture Search - Bichen Wu, Xiaoliang Dai, Peizhao Zhang, Yanghan Wang, Fei Sun, Yiming Wu, Yuandong Tian, Peter Vajda, Yangqing Jia, Kurt Keutzer. CVPR 2019.
- TASO: Optimizing Deep Learning Computation with Automatic Generation of Graph Substitutions - Zhihao Jia, Oded Padon, James Thomas, Todd Warszawski, Matei Zaharia, and Alex Aiken. ACM SOSP 2019.
- Reinforcement Learning Guided Software Debloating - Nham Le Van, Ashish Gehani, Arie Gurfinkel, Susmit Jha, and Jorge A. Navas. MLSys 2019.
- Learning to Optimize Halide with Tree Search and Random Programs - Andrew Adams, Karima Ma, Luke Anderson, Riyadh Baghdadi, Tzu-Mao Li, Michael Gharbi, Benoit Steiner, Steven Johson, Kayvon Fatahalian, Fredo Durand, Jonathan Ragan-Kelley. ACM Trans Graph, 2019.
- Ithemal: Accurate, portable and fast basic block throughput estimation using deep neural networks - Charith Mendis, Alex Renda, Saman Amarasinghe, and Michael Carbin. ICML 2019.
- Absinthe: Learning an Analytical Performance Model to Fuse and Tile Stencil Codes in One Shot - Tobias Gysi, Tobias Grosser, and Torsten Hoefler. PACT 2019.
- Predicting new workload or CPU performance by analyzing public datasets - Yu Wang, Victor Lee, Gu-Yeon Wei, and David Brooks. ACM Transactions on Architecture and Code Optimization (TACO), 2019.
- Generative Code Modeling with Graphs - Marc Brockschmidt, Miltos Allamanis, Alexander L. Gaunt, and Oleksandr Polozov. ICLR 2019.
- code2seq: Generating sequences from structured representations of code - Uri Alon, Shaked Brody, Omer Levy, and Eran Yahav. ICLR 2019.
- code2vec: Learning distributed representations of code - Uri Alon, Meital Zilberstein, Omer Levy, and Eran Yahav. POPL 2019.
- COSET: A Benchmark for Evaluating Neural Program Embeddings - Ke Wang, Mihai Christodorescu. arXiv 2019.
- Machine Learning in Compiler Optimisation - Zheng Wang and Michael O'Boyle, Proceedings of the IEEE, 2018
- A survey on compiler autotuning using machine learning - Amir H. Ashouri, William Killian, John Cavazos, Gianluca Palermo, and Cristina Silvano, ACM Computing Surveys (CSUR), 2018
- A survey of machine learning for big code and naturalness - Miltiadis Allamanis, Earl T. Barr, Premkumar Devanbu, and Charles Sutton, ACM Computing Surveys (CSUR), 2018
- TVM: An automated end-to-end optimizing compiler for deep learning - Tianqi Chen, Thierry Moreau, Ziheng Jiang, Lianmin Zheng, Eddie Yan, Haichen Shen, Meghan Cowan et al., OSDI 2018
- Learning to Represent Programs with Graphs - Miltiadis Allamanis, Marc Brockschmidt, and Mahmoud Khademi. ICLR 2018.
- Neural Code Comprehension: A Learnable Representation of Code Semantics - Tal Ben-Nun, Alice Shoshana Jakobovits, and Torsten Hoefler. NeurIPS 2018.
- Towards Better Understanding of Black-box Auto-tuning: A Comparative Analysis for Storage Systems - Zhen Cao, Vasily Tarasov, Sachin Tiwari, and Erez Zadok. ATC 2018.
- Micomp: Mitigating the compiler phase-ordering problem using optimization sub-sequences and machine learning - Amir H. Ashouri, Andrea Bignoli, Gianluca Palermo, Cristina Silvano, Sameer Kulkarni, and John Cavazos. ACM Transactions on Architecture and Code Optimization (TACO) 2017.
- Iterative Schedule Optimization for Parallelization in the Polyhedron Model - Stefan Ganser, Armin Grösslinger, Norbert Siegmund, Sven Apel, and Christian Lengauer. ACM Transactions on Architecture and Code Optimization (TACO), 2017.
- Learning to superoptimize programs - Rudy Bunel, Alban Desmaison, M. Pawan Kumar, Philip H.S. Torr, Pushmeet Kohlim. ICLR 2017
- BOAT: Building auto-tuners with structured Bayesian optimization - Valentin Dalibard, Michael Schaarschmidt, and Eiko Yoneki, WWW 2017.
- End-to-end deep learning of optimization heuristics - Chris Cummins, Pavlos Petoumenos, Zheng Wang, and Hugh Leather (slides). PACT 2017.
- Semantic-aware program sampling - Pratiksha Thaker, Daniel Tarlow, and Marc Brockschmidt. NeurIPS 2017.
- DeepCoder: Learning to write programs - Matej Balog, Alexander L. Gaunt, Marc Brockschmidt, Sebastian Nowozin, and Daniel Tarlow. ICLR 2017.
- Synthesizing Benchmarks for Predictive Modeling - Chris Cummins, Pavlos Petoumenos, Zheng Wang, and Hugh Leather (slides). CGO 2017.
- Minimizing the cost of iterative compilation with active learning - William Ogilvie, Pavlos Petoumenos, Zheng Wang, and Hugh Leather. CGO 2017.
- Cobayn: Compiler autotuning framework using bayesian networks - Amir H. Ashouri, Giovanni Mariani, Gianluca Palermo, Eunjung Park, John Cavazos, and Cristina Silvano, ACM Transactions on Architecture and Code Optimization (TACO), 2016.
- Convolutional neural networks over tree structures for programming language processing - Lili Mou, Ge Li, Lu Zhang, Tao Wang, and Zhi Jin. AAAI 2016.
- A Convolutional Attention Network for Extreme Summarization of Source Code - Miltos Allamanis, Hao Peng, and Charles Sutton. ICML 2016.
- Autotuning algorithmic choice for input sensitivity - Yufei Ding, Jason Ansel, Kalyan Veeramachaneni, Xipeng Shen, Una-May O'Reilly, and Saman Amarasinghe. PLDI 2015
- Fast: A fast stencil autotuning framework based on an optimal-solution space model - Yulong Luo, Guangming Tan, Zeyao Mo, and Ninghui Sun. ACM Transactions on Architecture and Code Optimization (TACO), 2015.
- GPU performance and power tuning using regression trees - Wenhao Jia, Elba Garza, Kelly A. Shaw, and Margaret Martonosi. SC 2015.
- Reinforcement learning-based inter-and intra-application thermal optimization for lifetime improvement of multicore systems - Anup K Das, Rishad Ahmed Shafik, Geoff V Merrett, Bashir M Al-Hashimi, Akash Kumar, Bharadwaj Veeravalli. DAC 2014
- Opentuner: An extensible framework for program autotuning - Jason Ansel, Shoaib Kamil, Kalyan Veeramachaneni, Jonathan Ragan-Kelley, Jeffrey Bosboom, Una-May O'Reilly, and Saman Amarasinghe. PACT 2014
- Continuous learning of compiler heuristics - Michele Tartara and Stefano Crespi Reghizzi. ACM Transactions on Architecture and Code Optimization (TACO), 2013.
- Automatic construction of inlining heuristics using machine learning. - Sameer Kulkarni, John Cavazos, Christian Wimmer, and Douglas Simon. CGO 2013.
- Halide: a language and compiler for optimizing parallelism, locality, and recomputation in image processing pipelines - Jonathan Ragan-Kelley, Connelly Barnes, Andrew Adams, Sylvain Paris, Frédo Durand, and Saman Amarasinghe, PLDI 2013.
- Mitigating the compiler optimization phase-ordering problem using machine learning - Sameer Kulkarni and John Cavazos. OOPSLA 2012.
- An evaluation of different modeling techniques for iterative compilation - Eunjung Park, Sameer Kulkarni, and John Cavazos. CASES 2011.
- Evaluating iterative optimization across 1000 datasets - Yang Chen, Yuanjie Huang, Lieven Eeckhout, Grigori Fursin, Liang Peng, Olivier Temam, and Chengyong Wu. PLDI 2010
- Automatic creation of tile size selection models - Tomofumi Yuki, Lakshminarayanan Renganarayanan, Sanjay Rajopadhye, Charles Anderson, Alexandre E. Eichenberger, and Kevin O'Brien. CGO 2010.
- Iterative optimization in the polyhedral model: Part II, multidimensional time - Louis-Noël Pouchet, Cédric Bastoul, Albert Cohen, and John Cavazos. PLDI 2008.
- Cole: compiler optimization level exploration - Kenneth Hoste and Lieven Eeckhout. CGO 2008.
- MILEPOST GCC: machine learning based research compiler - Grigori Fursin, Cupertino Miranda, Olivier Temam, Mircea Namolaru, Elad Yom-Tov, Ayal Zaks, Bilha Mendelson et al., 2008
- Evaluating heuristic optimization phase order search algorithms - J. W. Davidson, Gary S. Tyson, D. B. Whalley, and P. A. Kulkarni. CGO 2007.
- Rapidly selecting good compiler optimizations using performance counters - John Cavazos, Grigori Fursin, Felix Agakov, Edwin Bonilla, Michael FP O'Boyle, and Olivier Temam. CGO 2007.
- Using machine learning to focus iterative optimization - Felix Agakov, Edwin Bonilla, John Cavazos, Björn Franke, Grigori Fursin, Michael FP O'Boyle, John Thomson, Marc Toussaint, and Christopher KI Williams. CGO 2006.
- Method-specific dynamic compilation using logistic regression - John Cavazos and Michael FP O'Boyle. OOPSLA 2005.
- Predicting unroll factors using supervised classification - Mark Stephenson and Saman Amarasinghe. CGO 2005.
- Fast searches for effective optimization phase sequences - Prasad Kulkarni, Stephen Hines, Jason Hiser, David Whalley, Jack Davidson, and Douglas Jones. PLDI 2004.
- Automatic Tuning of Compilers Using Machine Learning - Amir H. Ashouri, Gianluca Palermo, John Cavazos, and Cristina Silvano. Springer 2018.
- Software Automatic Tuning - From Concepts to State-of-the-Art Results - K Naono, K Teranishi, J Cavazos, and R Suda. Springer 2010.
- Saman Amarasinghe, Compiler 2.0: Using Machine Learning to Modernize Compiler Technology. LCTES 2020.
- Using Machine Learning to Modernize Compiler Technology, ACM SIGPLAN
- Amir Ashouri, Compiler Autotuning using Machine Learning: A State-of-the-art Review (slides). Polytechnic University of Milan 2018.
- Understanding Compiler Optimization, DevConfCZ 2019
- CompilerGym: Robust, Performant Compiler Optimization Environments for AI, CGO Conference 2022
- Jinliang Wei, Google AI Quorum: Scaling Deep Learning Training With Compiler Optimizations, the AI quorum, 2022
- CompilerGym - About Reinforcement learning environments for compiler and program optimization tasks (paper).
- CodeBERT - pre-trained DNN models for programming languages provided by Microsoft. The repository also includes implementations of several related models: CodeBERT, GraphCodeBERT, UniXcoder, CodeReviewer, CodeExecutor, LongCoder (paper).
- NeuroVectorizer - Using deep reinforcement learning to predict optimal vectorization compiler pragmas by Intel (paper).
- programl - ProGraML is a representation for programs as input to a machine learning model (paper).
- ONNX-MLIR - Representation and Reference Lowering of ONNX Models in MLIR Compiler Infrastructure (paper).
- IREE - A retargetable MLIR-based machine learning compiler and runtime toolkit.
- SuperSonic - Automating reinforcement learning architecture design for code optimization (paper).
- clgen - Deep learning program generator using LSTM (paper; slides).
- OpenTuner - An extensible framework for program autotuning (paper; slides)
- SPEC CPU 2017 - the SPEC CPU2017 benchmark suite.
- CodeNet - a large-scale, diverse, and high-quality curated dataset for the AI-for-Code research community, intended to drive innovation in AI techniques (acceptance/error types).
- CodeXGLUE - Dataset for Code Understanding and Generation by Microsoft (paper)
- BHive - A Benchmark Suite and Measurement Framework for Validating x86-64 Basic Block Performance Models (paper).
- PolyBench - a collection of benchmarks containing static control parts, designed to standardize the execution and monitoring of kernels (paper).
- devmap - End-to-end Deep Learning of Optimization Heuristics (paper; slides).
- TenSet: A Large-scale Program Performance Dataset for Learned Tensor Compilers - (paper).
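Many of the autotuning systems listed above (OpenTuner, the iterative-compilation work, and the tensor-program tuners) share the same core loop: propose a configuration from a search space, measure its cost, and keep the best result. The sketch below shows that loop as plain random search over a hypothetical search space with a toy stand-in cost function; a real tuner would compile and time the program at the `measure` step, and the parameter names here are illustrative, not any specific tool's API.

```python
import random

# Hypothetical search space: an optimization level and a tile size,
# standing in for the kinds of parameters real autotuners explore.
SEARCH_SPACE = {
    "opt_level": [0, 1, 2, 3],
    "tile_size": [8, 16, 32, 64, 128],
}

def measure(config):
    # Toy stand-in cost model; a real autotuner would compile the
    # program with this config and measure its runtime here.
    # This surface is minimized at opt_level=3, tile_size=32.
    return abs(config["tile_size"] - 32) + (3 - config["opt_level"]) * 10

def random_search(iterations=500, seed=0):
    rng = random.Random(seed)
    best_config, best_cost = None, float("inf")
    for _ in range(iterations):
        # Propose: sample one value per tunable parameter.
        config = {k: rng.choice(v) for k, v in SEARCH_SPACE.items()}
        # Measure and update the incumbent.
        cost = measure(config)
        if cost < best_cost:
            best_config, best_cost = config, cost
    return best_config, best_cost

best, cost = random_search()
```

The frameworks above keep this measure-and-update structure but replace the random proposal step with smarter search: Bayesian optimization (BaCO, Cobayn), evolutionary search (PMEvo), or reinforcement learning (NeuroVectorizer, AutoPhase).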
- ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI
- Architectural Support for Programming Languages and Operating Systems, ASPLOS
- International Symposium on Code Generation and Optimization, CGO
- International Conference on Parallel Architectures and Compilation Techniques, PACT
- International Conference on Compiler Construction, CC
- International Conference on Supercomputing, ICS
- International Conference on High Performance and Embedded Architectures and Compilers, HiPEAC
- International Conference on Languages, Compilers and Tools for Embedded Systems, LCTES
- International Conference on Computing Frontiers, CF
- International Parallel and Distributed Processing Symposium, IPDPS
- International Conference for High Performance Computing, Networking, Storage, and Analysis, SC
- Machine Learning and Programming Languages Workshop, MAPL
- Languages and Compilers for Parallel Computing, LCPC
- Conference on Machine Learning and Systems, MLSys
To add a paper, project, or other resource, send one of the maintainers a pull request.