KTO training with datasets in alpaca format
Cheungki opened this issue · 5 comments
Cheungki commented
Nice work!
I'm glad to find that LLaMA-Factory supports KTO training. But training with datasets in alpaca format will lead to an error that all datapoints will be described as desired examples. A possible reason might be that examples["response"][i][0]["content"]
here will always be true.
hiyouga commented
ok I'll fix it, thanks for pointing it out
hiyouga commented
fixed
svjack commented
fixed
Where is kto_chosen_weight and kto_rejected_weight in ui ?
And if will add a auto calculate logic of this two value based on ratio between chosen and rejected sample ?