junkangwu/Dr_DPO

Towards Robust Alignment of Language Models: Distributionally Robustifying Direct Preference Optimization

Python

1 Issue
8 Stargazers
1 Watcher

Watchers

junkangwu
University of Science and Technology of China
