soft-mixture-of-experts

PyTorch implementation of Soft MoE, the mixture-of-experts layer introduced by Google Brain in "From Sparse to Soft Mixtures of Experts" (https://arxiv.org/pdf/2308.00951.pdf).
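
For reference, here is a minimal sketch of the Soft MoE dispatch/combine mechanism described in the paper: every slot is built as a softmax-weighted (over tokens) mixture of the input, each expert processes its slots, and every output token is a softmax-weighted (over slots) mixture of the expert outputs. The class and argument names (`SoftMoE`, `num_experts`, `slots_per_expert`, `hidden_mult`) are illustrative assumptions, not necessarily the API exposed by this repository.

```python
import torch
import torch.nn as nn


class SoftMoE(nn.Module):
    """Sketch of a Soft MoE layer (soft, fully differentiable token routing)."""

    def __init__(self, dim, num_experts=4, slots_per_expert=1, hidden_mult=4):
        super().__init__()
        self.num_experts = num_experts
        self.slots_per_expert = slots_per_expert
        # One learned d-dimensional parameter vector per slot (phi in the paper).
        self.slot_embeds = nn.Parameter(
            torch.randn(num_experts * slots_per_expert, dim) * dim**-0.5
        )
        # Each expert is a small MLP applied independently to its slots.
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(dim, dim * hidden_mult),
                nn.GELU(),
                nn.Linear(dim * hidden_mult, dim),
            )
            for _ in range(num_experts)
        )

    def forward(self, x):
        # x: (batch, tokens, dim)
        logits = torch.einsum("bnd,sd->bns", x, self.slot_embeds)

        # Dispatch: each slot is a convex combination of all input tokens
        # (softmax over the token axis).
        dispatch = logits.softmax(dim=1)
        slots = torch.einsum("bns,bnd->bsd", dispatch, x)

        # Each expert processes its own group of slots.
        slots = slots.view(x.size(0), self.num_experts, self.slots_per_expert, -1)
        outs = torch.stack(
            [expert(slots[:, i]) for i, expert in enumerate(self.experts)], dim=1
        )
        outs = outs.view(x.size(0), -1, x.size(-1))  # (batch, slots, dim)

        # Combine: each output token is a convex combination of all slot outputs
        # (softmax over the slot axis).
        combine = logits.softmax(dim=2)
        return torch.einsum("bns,bsd->bnd", combine, outs)


# Example usage (shapes only; hyperparameters are arbitrary):
# moe = SoftMoE(dim=512, num_experts=8, slots_per_expert=1)
# y = moe(torch.randn(2, 196, 512))  # -> (2, 196, 512)
```

Because routing is done with softmax weights rather than hard top-k assignment, the layer avoids dropped tokens and load-balancing losses and remains fully differentiable, which is the core idea of the paper.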

Primary language: Python · License: MIT
