My Implementation of Q-Sparse: All Large Language Models can be Fully Sparsely-Activated
Primary language: Python. License: MIT.