eric-mitchell/direct-preference-optimization
Reference implementation for DPO (Direct Preference Optimization)
Python · Apache-2.0 license
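
For context, DPO trains the policy directly on preference pairs, without a separate reward model: the loss is the negative log-sigmoid of the scaled difference between the policy/reference log-probability ratios of the chosen and rejected responses. The snippet below is a minimal sketch of that objective, not this repository's actual code; the function name `dpo_loss` and its argument names are placeholders.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Illustrative DPO loss from per-sequence log-probabilities.

    Each argument is a (batch,) tensor holding the summed log-probability
    of the chosen/rejected response under the trainable policy or the
    frozen reference model; beta scales the implicit reward.
    """
    # Implicit rewards are the scaled log-ratios between policy and reference.
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # DPO minimizes -log sigmoid(chosen_reward - rejected_reward).
    losses = -F.logsigmoid(chosen_rewards - rejected_rewards)
    return losses.mean()
```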
Stargazers
- abodacs (OpenCoast)
- aleksa-sukovic (Natif AI)
- anisha2102 (University of Texas at Austin)
- atemaguer (Palo Alto, California)
- bharathh4
- BruthYU (ICALK, ECNU)
- BunsenFeng (Seattle, US)
- Ch4osMy7h (Alibaba)
- ctlllll (@Princeton)
- decahedron1 (@pykeio)
- fly51fly (PRIS)
- Glavin001 (Remote Software Developer)
- gsarti (University of Groningen)
- hsvgbkhgbv (University of Bristol / University of Manchester)
- huey2531
- hysts
- juand-r
- kashif (Berlin, Germany)
- kaustubhsridhar
- khursani8 (Kuala Lumpur)
- liuqi8827 (Harbin Institute of Technology)
- liutianlin0121 (Basel)
- mikebern
- ngold5 (Shara)
- Peng-YM (China)
- prateeky2806
- ravindra-ut
- raymondng76 (AI Singapore)
- rosikand (Stanford University)
- sanglinwei (Tsinghua University)
- stjordanis (Greece)
- strandline (Houston, Texas)
- tokestermw (Cresta)
- xukkx
- Yannlecun
- Yueeeeeeee (UIUC)