Awesome-Mixture-of-Experts-Papers is a curated list of Mixture-of-Experts (MoE) papers from recent years. Star this repository to keep abreast of the latest developments in this booming research field.
Thanks to everyone who has contributed to this project. We strongly encourage researchers to open pull requests (e.g., to add missing papers or fix errors) and help others in this community!
- Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity [pdf] arXiv 2021
  William Fedus, Barret Zoph, Noam Shazeer
- DSelect-k: Differentiable Selection in the Mixture of Experts with Applications to Multi-Task Learning [pdf] NeurIPS 2021
  Hussein Hazimeh, Zhe Zhao, Aakanksha Chowdhery, Maheswaran Sathiamoorthy, Yihua Chen, Rahul Mazumder, Lichan Hong, Ed H. Chi
- FastMoE: A Fast Mixture-of-Expert Training System [pdf] arXiv 2021
  Jiaao He, Jiezhong Qiu, Aohan Zeng, Zhilin Yang, Jidong Zhai, Jie Tang
- DeepSpeed-MoE: Advancing Mixture-of-Experts Inference and Training to Power Next-Generation AI Scale [pdf] arXiv 2022
  Samyam Rajbhandari, Conglong Li, Zhewei Yao, Minjia Zhang, Reza Yazdani Aminabadi, Ammar Ahmad Awan, Jeff Rasley, Yuxiong He
- Modeling Task Relationships in Multi-task Learning with Multi-gate Mixture-of-Experts [pdf] SIGKDD 2018
  Jiaqi Ma, Zhe Zhao, Xinyang Yi, Jilin Chen, Lichan Hong, Ed Chi
- Video Recommendation with Multi-gate Mixture of Experts Soft Actor Critic [pdf] SIGIR 2021
  Dingcheng Li, Xu Li, Jun Wang, Ping Li
Contributed by Xiaonan Nie, Xupeng Miao, Qibin Liu, and members of the Hetu team.