Vision Transformer with Soft Mixture of Experts (Soft MoE) layers, which address the limitations of sparse MoE routing highlighted by Puigcerver et al. (2023).
Primary Language: Python
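
For orientation, the Soft MoE routing described by Puigcerver et al. can be sketched roughly as below. This is a minimal NumPy sketch and not this repository's implementation: the names `soft_moe_layer`, `phi` and `make_toy_expert`, the shapes, and the use of plain linear maps as experts are all assumptions made for illustration. Each slot is a softmax-weighted average of all input tokens, each expert processes its own slots, and each output token is a softmax-weighted average of all slot outputs.

```python
import numpy as np


def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def soft_moe_layer(tokens, phi, experts, slots_per_expert):
    """Soft MoE routing for a single sequence (illustrative sketch).

    tokens:  (m, d) input tokens
    phi:     (d, n * p) slot parameters, n experts with p slots each
    experts: list of n callables, each mapping (p, d) -> (p, d)
    """
    p = slots_per_expert
    logits = tokens @ phi                 # (m, n*p) token-slot affinities
    dispatch = softmax(logits, axis=0)    # per slot: weights over tokens
    combine = softmax(logits, axis=1)     # per token: weights over slots
    slots = dispatch.T @ tokens           # (n*p, d) weighted token averages
    # Expert i processes its own contiguous group of p slots.
    outputs = [f(slots[i * p:(i + 1) * p]) for i, f in enumerate(experts)]
    slot_outputs = np.concatenate(outputs, axis=0)   # (n*p, d)
    return combine @ slot_outputs         # (m, d)


def make_toy_expert(rng, d):
    # Hypothetical stand-in for an expert MLP: a single random linear map.
    w = rng.normal(scale=0.1, size=(d, d))
    return lambda s: s @ w


rng = np.random.default_rng(0)
m, d, n, p = 6, 4, 3, 2                   # tokens, width, experts, slots/expert
tokens = rng.normal(size=(m, d))
phi = rng.normal(size=(d, n * p))
experts = [make_toy_expert(rng, d) for _ in range(n)]

out = soft_moe_layer(tokens, phi, experts, p)
print(out.shape)
```

Because both the dispatch and combine weights are dense softmaxes, every token contributes to every slot, so no token is dropped and the whole layer stays differentiable.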