junkangwu/beta-DPO

[NeurIPS 2024] Official code of $\beta$-DPO: Direct Preference Optimization with Dynamic $\beta$

Python

Issues

Lines 93 and 94 in Trainer.py don’t seem to use beta? Am I missing something?
#1 opened 5 months ago by geighz · 1 comment