eric-mitchell/direct-preference-optimization
Reference implementation for DPO (Direct Preference Optimization)
Python · Apache-2.0
Watchers
- bugface (US)
- CamaradaLares
- eemailme
- egafni (San Francisco, CA)
- elabeca (London)
- eric-mitchell (Stanford, CA)
- erickmiller (CoinCircle)
- janphilippfranken
- mbofb
- michalwols (New York)
- modelriskanalytics
- shuyueW1991 (Talent Medicore Holdings Group)
- songkq
- tatianasolonets
- vgoklani (New York, NY)
- VonRosenchild (HubBucket, Inc.)
- wookayin (University of Michigan)
- xukkx
- zebrajackpungke