/Token-level-Direct-Preference-Optimization

Reference implementation for Token-level Direct Preference Optimization(TDPO)

Primary LanguagePythonApache License 2.0Apache-2.0

Watchers