Awesome-Mixture-of-Experts-Papers is a curated list of Mixture-of-Experts (MoE) papers from recent years. Star this repository to keep abreast of the latest developments in this booming research field.
Thanks to everyone who has contributed to this project. We strongly encourage researchers to open pull requests (e.g., to add missing papers or fix errors) and help others in this community!
- Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity [pdf] arXiv 2021
  William Fedus, Barret Zoph, Noam Shazeer
- DSelect-k: Differentiable Selection in the Mixture of Experts with Applications to Multi-Task Learning [pdf] NeurIPS 2021
  Hussein Hazimeh, Zhe Zhao, Aakanksha Chowdhery, Maheswaran Sathiamoorthy, Yihua Chen, Rahul Mazumder, Lichan Hong, Ed H. Chi
- FastMoE: A Fast Mixture-of-Expert Training System [pdf] arXiv 2021
  Jiaao He, Jiezhong Qiu, Aohan Zeng, Zhilin Yang, Jidong Zhai, Jie Tang
- DeepSpeed-MoE: Advancing Mixture-of-Experts Inference and Training to Power Next-Generation AI Scale [pdf] arXiv 2022
  Samyam Rajbhandari, Conglong Li, Zhewei Yao, Minjia Zhang, Reza Yazdani Aminabadi, Ammar Ahmad Awan, Jeff Rasley, Yuxiong He
- Modeling Task Relationships in Multi-task Learning with Multi-gate Mixture-of-Experts [pdf] SIGKDD 2018
  Jiaqi Ma, Zhe Zhao, Xinyang Yi, Jilin Chen, Lichan Hong, Ed Chi
- Video Recommendation with Multi-gate Mixture of Experts Soft Actor Critic [pdf] SIGIR 2021
  Dingcheng Li, Xu Li, Jun Wang, Ping Li
Contributed by Xiaonan Nie, Xupeng Miao, Qibin Liu, and members of the Hetu team.