A list of awesome compiler projects and papers for tensor computation and deep learning.
- TVM: An End-to-End Deep Learning Compiler Stack (a minimal usage sketch follows this project list)
- Halide: A Language for Fast, Portable Computation on Images and Tensors
- Tensor Comprehensions: Framework-Agnostic High-Performance Machine Learning Abstractions
- Tiramisu: A Polyhedral Compiler for Expressing Fast and Portable Code
- XLA: Optimizing Compiler for Machine Learning
- MLIR: Multi-Level Intermediate Representation
- Hummingbird: Compiling Trained ML Models into Tensor Computation
- NNFusion: A Flexible and Efficient Deep Neural Network Compiler
- nGraph: An Open-Source C++ Library, Compiler, and Runtime for Deep Learning
- PlaidML: A Platform for Making Deep Learning Work Everywhere
- Glow: Compiler for Neural Network Hardware Accelerators
- TACO: The Tensor Algebra Compiler
- TASO: The Tensor Algebra SuperOptimizer for Deep Learning
- Triton: An Intermediate Language and Compiler for Tiled Neural Network Computations
- DLVM: Modern Compiler Infrastructure for Deep Learning Systems
- NN-512: A compiler that generates C99 code for neural net inference
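Many of the projects above follow Halide's central idea of separating *what* to compute from *how* to compute it (the schedule). As a flavor of that idiom, here is a minimal vector-add sketch using TVM's tensor-expression (`te`) API roughly as of TVM 0.8; newer releases have moved toward TensorIR, so treat the exact calls as a sketch rather than the current API.

```python
# Minimal TVM tensor-expression sketch (TVM ~0.8-era `te` API).
import numpy as np
import tvm
from tvm import te

n = 1024
A = te.placeholder((n,), name="A")                      # declare inputs
B = te.placeholder((n,), name="B")
C = te.compute((n,), lambda i: A[i] + B[i], name="C")   # what to compute

s = te.create_schedule(C.op)                            # how to compute it
outer, inner = s[C].split(C.op.axis[0], factor=64)      # tile the loop
s[C].vectorize(inner)                                   # vectorize inner loop

fadd = tvm.build(s, [A, B, C], target="llvm", name="vector_add")

dev = tvm.cpu(0)
a = tvm.nd.array(np.random.rand(n).astype("float32"), dev)
b = tvm.nd.array(np.random.rand(n).astype("float32"), dev)
c = tvm.nd.array(np.zeros(n, dtype="float32"), dev)
fadd(a, b, c)
np.testing.assert_allclose(c.numpy(), a.numpy() + b.numpy(), rtol=1e-5)
```

The papers below trace how these systems and their optimizations developed.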
- The Deep Learning Compiler: A Comprehensive Survey by Mingzhen Li et al., TPDS 2020
- An In-depth Comparison of Compilers for Deep Neural Networks on Hardware by Yu Xing et al., ICESS 2019
- DeepCuts: A deep learning optimization framework for versatile GPU workloads by Wookeun Jung et al., PLDI 2021
- PET: Optimizing Tensor Programs with Partially Equivalent Transformations and Automated Corrections by Haojie Wang et al., OSDI 2021
- MLIR: Scaling Compiler Infrastructure for Domain Specific Computation by Chris Lattner et al., CGO 2021
- A Tensor Compiler for Unified Machine Learning Prediction Serving by Supun Nakandala et al., OSDI 2020
- Rammer: Enabling Holistic Deep Learning Compiler Optimizations with rTasks by Lingxiao Ma et al., OSDI 2020
- MLIR: A Compiler Infrastructure for the End of Moore's Law by Chris Lattner et al., arXiv 2020
- TASO: The Tensor Algebra SuperOptimizer for Deep Learning by Zhihao Jia et al., SOSP 2019
- Tiramisu: A Polyhedral Compiler for Expressing Fast and Portable Code by Riyadh Baghdadi et al., CGO 2019
- Triton: An Intermediate Language and Compiler for Tiled Neural Network Computations by Philippe Tillet et al., MAPL 2019
- Relay: A High-Level Compiler for Deep Learning by Jared Roesch et al., arXiv 2019
- TVM: An Automated End-to-End Optimizing Compiler for Deep Learning by Tianqi Chen et al., OSDI 2018
- Tensor Comprehensions: Framework-Agnostic High-Performance Machine Learning Abstractions by Nicolas Vasilache et al., arXiv 2018
- Intel nGraph: An Intermediate Representation, Compiler, and Executor for Deep Learning by Scott Cyphers et al., arXiv 2018
- Glow: Graph Lowering Compiler Techniques for Neural Networks by Nadav Rotem et al., arXiv 2018
- DLVM: A modern compiler infrastructure for deep learning systems by Richard Wei et al., arXiv 2018
- Diesel: DSL for Linear Algebra and Neural Net Computations on GPUs by Venmugil Elango et al., MAPL 2018
- The Tensor Algebra Compiler by Fredrik Kjolstad et al., OOPSLA 2017
- Halide: A Language and Compiler for Optimizing Parallelism, Locality, and Recomputation in Image Processing Pipelines by Jonathan Ragan-Kelley et al., PLDI 2013
- Value Learning for Throughput Optimization of Deep Neural Networks by Benoit Steiner et al., MLSys 2021
- Ansor: Generating High-Performance Tensor Programs for Deep Learning by Lianmin Zheng et al., OSDI 2020
- Schedule Synthesis for Halide Pipelines on GPUs by Savvas Sioutas et al., TACO 2020
- FlexTensor: An Automatic Schedule Exploration and Optimization Framework for Tensor Computation on Heterogeneous System by Size Zheng et al., ASPLOS 2020
- ProTuner: Tuning Programs with Monte Carlo Tree Search by Ameer Haj-Ali et al., arXiv 2020
- AdaTune: Adaptive Tensor Program Compilation Made Efficient by Menghao Li et al., NeurIPS 2020
- Optimizing the Memory Hierarchy by Compositing Automatic Transformations on Computations and Data by Jie Zhao et al., MICRO 2020
- Chameleon: Adaptive Code Optimization for Expedited Deep Neural Network Compilation by Byung Hoon Ahn et al., ICLR 2020
- Learning to Optimize Halide with Tree Search and Random Programs by Andrew Adams et al., SIGGRAPH 2019
- Learning to Optimize Tensor Programs by Tianqi Chen et al., NeurIPS 2018 (a toy version of this measure-and-search loop appears after this paper list)
- Automatically Scheduling Halide Image Processing Pipelines by Ravi Teja Mullapudi et al., SIGGRAPH 2016
- A Deep Learning Based Cost Model for Automatic Code Optimization in Tiramisu by Massinissa Merouani et al., Graduation Thesis 2020
- A Deep Learning Based Cost Model for Automatic Code Optimization by Riyadh Baghdadi et al., MLSys 2021
- A Learned Performance Model for the Tensor Processing Unit by Samuel J. Kaufman et al., MLSys 2021
- DynaTune: Dynamic Tensor Program Optimization in Deep Neural Network Compilation by Minjia Zhang et al., ICLR 2021
- MetaTune: Meta-Learning Based Cost Model for Fast and Efficient Auto-tuning Frameworks by Jaehun Ryu et al., arXiv 2021
- TenSet: A Large-scale Program Performance Dataset for Learned Tensor Compilers by Lianmin Zheng et al., NeurIPS 2021
- PolyDL: Polyhedral Optimizations for Creation of High-Performance DL Primitives by Sanket Tavarageri et al., arXiv 2020
- Automatic Generation of High-Performance Quantized Machine Learning Kernels by Meghan Cowan et al., CGO 2020
- Optimizing CNN Model Inference on CPUs by Yizhi Liu et al., ATC 2019
- Analytical Cache Modeling and Tile Size Optimization for Tensor Contractions by Rui Li et al., SC 2019
- Analytical Characterization and Design Space Exploration for Optimization of CNNs by Rui Li et al., ASPLOS 2021
- Fireiron: A Data-Movement-Aware Scheduling Language for GPUs by Bastian Hagedorn et al., PACT 2020
- Automatic Kernel Generation for Volta Tensor Cores by Somashekaracharya G. Bhaskaracharya et al., arXiv 2020
- AKG: Automatic Kernel Generation for Neural Processing Units using Polyhedral Transformations by Jie Zhao et al., PLDI 2021
- Optimizing DNN Computation Graph using Graph Substitutions by Jingzhi Fang et al., VLDB 2020
- Transferable Graph Optimizers for ML Compilers by Yanqi Zhou et al., NeurIPS 2020
- FusionStitching: Boosting Memory Intensive Computations for Deep Learning Workloads by Zhen Zheng et al., arXiv 2020
- Nimble: Lightweight and Parallel GPU Task Scheduling for Deep Learning by Woosuk Kwon et al., NeurIPS 2020
- Equality Saturation for Tensor Graph Superoptimization by Yichen Yang et al., MLSys 2021
- IOS: An Inter-Operator Scheduler for CNN Acceleration by Yaoyao Ding et al., MLSys 2021
- Nimble: Efficiently Compiling Dynamic Neural Networks for Model Inference by Haichen Shen et al., MLSys 2021
- Cortex: A Compiler for Recursive Deep Learning Models by Pratik Fegade et al., MLSys 2021
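The auto-tuning papers above (AutoTVM, Ansor, and the cost-model line of work) share one skeleton: enumerate candidate schedules, estimate or measure their cost, and keep the best. Below is a toy, self-contained Python sketch of that loop for a blocked matmul; the tile-size search space and timing-based cost are illustrative stand-ins for the far richer search spaces and learned cost models in those papers.

```python
# Toy auto-tuning loop: search over tile sizes for a blocked matmul,
# measure each candidate, and keep the fastest. All names here are
# illustrative; real systems replace direct measurement with a learned
# cost model to prune much larger search spaces.
import time
import numpy as np

def tiled_matmul(A, B, tile):
    """Blocked matrix multiply with a given tile size."""
    n = A.shape[0]
    C = np.zeros((n, n), dtype=A.dtype)
    for i0 in range(0, n, tile):
        for j0 in range(0, n, tile):
            for k0 in range(0, n, tile):
                C[i0:i0+tile, j0:j0+tile] += (
                    A[i0:i0+tile, k0:k0+tile] @ B[k0:k0+tile, j0:j0+tile]
                )
    return C

n = 256
A = np.random.rand(n, n).astype("float32")
B = np.random.rand(n, n).astype("float32")

best = None
for tile in [16, 32, 64, 128]:        # the "schedule" search space
    start = time.perf_counter()
    tiled_matmul(A, B, tile)
    cost = time.perf_counter() - start
    if best is None or cost < best[1]:
        best = (tile, cost)
print(f"best tile size: {best[0]} ({best[1] * 1e3:.2f} ms)")
```

Real systems scale this loop with learned cost models (see the MLSys 2021 cost-model papers above) and smarter search strategies such as evolutionary search or Monte Carlo tree search (ProTuner), since practical schedule spaces contain billions of candidates.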
We encourage all contributions to this repository. Open an issue or send a pull request.