/qwen-dpo

DPO training for Tongyi Qianwen (Qwen)

Primary Language: Jupyter Notebook
