/MoE-SFT

🍼 Official implementation of Dynamic Data Mixing Maximizes Instruction Tuning for Mixture-of-Experts

Primary language: Python

License: Apache License 2.0 (Apache-2.0)
