Vance0124/Token-level-Direct-Preference-Optimization
Reference implementation for Token-level Direct Preference Optimization (TDPO)
Python, Apache-2.0
Stargazers
- AGTSAAA
- alexandonian (MIT CSAIL)
- CMander02
- dantruonghtno1 (Hà Nội)
- davidchern
- DiXue98 (NJU)
- fiberleif (Microsoft)
- fly51fly (PRIS)
- forrestbing (Alibaba Inc)
- hanyang1999
- HillZhang1999 (Bytedance)
- histmeisah (UCAS)
- huang-xx (ByteDance)
- iamtatsuki05 (Tokyo)
- jinnaiyuu (Tokyo)
- jyweky
- Kyeongpil (Scatter Lab)
- LuckerYi
- MasterVito (Tsinghua University)
- Mr-Loevan
- Olivia-fsm (Ecole Polytech Federal of Lausanne)
- peterjc123 (Shanghai, China)
- SAMUSENPS
- sandkoan
- Shylock-H (Nanjing University)
- superboySB (Beijing Institute of Technology)
- SushantDaga
- TongLi3701 (Colossal-AI Team @hpcaitech)
- tyshiwo1
- vicgalle (Komorebi AI & ICMAT-CSIC)
- winglian (Annapolis, MD)
- xiaojingli (Munich, Germany)
- zhanghaoie
- zhaochen0110 (Soochow University)
- zhimin-z (Software Analysis and Intelligence Lab)
- zpschang