hiyouga/LLaMA-Factory

KTO training with datasets in alpaca format

Cheungki opened this issue · 5 comments

Nice work!

I'm glad to find that LLaMA-Factory supports KTO training. But training with datasets in the alpaca format leads to a bug where all data points are treated as desirable examples. A possible reason is that examples["response"][i][0]["content"] here is always truthy.
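For illustration, here is a minimal sketch of the tagging step, assuming the alpaca-format dataset carries an explicit `kto_tag` column; the function below is hypothetical and only contrasts the truthiness check with the intended tag lookup, it is not LLaMA-Factory's actual aligner code:

```python
# Hypothetical sketch (not LLaMA-Factory's actual code) of how the
# desirable/undesirable tag could be derived during alpaca -> KTO conversion.

def convert_alpaca_to_kto_tags(examples: dict) -> list:
    kto_tags = []
    for i in range(len(examples["response"])):
        # Buggy check: the assistant reply is a non-empty string for every
        # example, so this is always truthy and every data point ends up
        # tagged as desirable.
        # kto_tag = bool(examples["response"][i][0]["content"])

        # Intended check: rely on the dataset's explicit kto_tag column
        # instead of the truthiness of the response content.
        kto_tags.append(bool(examples["kto_tag"][i]))
    return kto_tags


# Example batch with one desirable and one undesirable sample.
batch = {
    "response": [[{"role": "assistant", "content": "good answer"}],
                 [{"role": "assistant", "content": "bad answer"}]],
    "kto_tag": [True, False],
}
print(convert_alpaca_to_kto_tags(batch))  # [True, False]
```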

ok I'll fix it, thanks for pointing it out

fixed

Where are kto_chosen_weight and kto_rejected_weight in the web UI?
Also, will there be logic to automatically calculate these two values based on the ratio between chosen and rejected samples?

@svjack the webui will be updated later

Is there a relatively robust way to compute these two weights based on the ratio of positive to negative samples?
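For reference, here is a minimal sketch of one way to derive the two weights from the sample counts; the helper name `suggest_kto_weights` is hypothetical, and it follows the KTO paper's recommendation to keep kto_chosen_weight * n_desirable / (kto_rejected_weight * n_undesirable) roughly in the range [1, 4/3]:

```python
# Illustrative sketch (not LLaMA-Factory code): derive kto_chosen_weight and
# kto_rejected_weight from the counts of desirable and undesirable examples
# so that both classes contribute comparably to the loss.

def suggest_kto_weights(n_desirable: int, n_undesirable: int,
                        target_ratio: float = 1.0):
    """Return (kto_chosen_weight, kto_rejected_weight) balancing the two classes.

    target_ratio is the desired value of
    chosen_weight * n_desirable / (rejected_weight * n_undesirable);
    the KTO paper suggests keeping it between 1 and 4/3.
    """
    if n_desirable == 0 or n_undesirable == 0:
        raise ValueError("KTO needs both desirable and undesirable examples.")
    # Fix the rejected weight at 1.0 and scale the chosen weight accordingly.
    kto_rejected_weight = 1.0
    kto_chosen_weight = target_ratio * n_undesirable / n_desirable
    return kto_chosen_weight, kto_rejected_weight


# Example: 8000 desirable vs. 2000 undesirable samples
# -> (0.25, 1.0), i.e. both classes contribute equally overall.
print(suggest_kto_weights(8000, 2000))
```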