hiyouga/LLaMA-Factory

KTO training with datasets in alpaca format

Cheungki opened this issue · 5 comments

Nice work!

I'm glad to find that LLaMA-Factory supports KTO training. But training with datasets in the alpaca format leads to a bug where all data points are treated as desirable examples. A possible reason is that examples["response"][i][0]["content"] here is always truthy.
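For illustration, here is a minimal sketch of the tagging step, assuming the alpaca-format dataset carries an explicit `kto_tag` column; the function below is hypothetical and only contrasts the truthiness check with the intended tag lookup, it is not LLaMA-Factory's actual aligner code:

```python
# Hypothetical sketch (not LLaMA-Factory's actual code) of how the
# desirable/undesirable tag could be derived during alpaca -> KTO conversion.

def convert_alpaca_to_kto_tags(examples: dict) -> list:
    kto_tags = []
    for i in range(len(examples["response"])):
        # Buggy check: the assistant reply is a non-empty string for every
        # example, so this is always truthy and every data point ends up
        # tagged as desirable.
        # kto_tag = bool(examples["response"][i][0]["content"])

        # Intended check: rely on the dataset's explicit kto_tag column
        # instead of the truthiness of the response content.
        kto_tags.append(bool(examples["kto_tag"][i]))
    return kto_tags


# Example batch with one desirable and one undesirable sample.
batch = {
    "response": [[{"role": "assistant", "content": "good answer"}],
                 [{"role": "assistant", "content": "bad answer"}]],
    "kto_tag": [True, False],
}
print(convert_alpaca_to_kto_tags(batch))  # [True, False]
```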

ok I'll fix it, thanks for pointing it out

fixed

Where are kto_chosen_weight and kto_rejected_weight in the web UI?
Also, will there be logic to automatically calculate these two values based on the ratio between chosen and rejected samples?

@svjack the webui will be updated later

Is there a relatively robust way to compute these two weights based on the ratio of positive to negative samples?
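For reference, here is a minimal sketch of one way to derive the two weights from the sample counts; the helper name `suggest_kto_weights` is hypothetical, and it follows the KTO paper's recommendation to keep kto_chosen_weight * n_desirable / (kto_rejected_weight * n_undesirable) roughly in the range [1, 4/3]:

```python
# Illustrative sketch (not LLaMA-Factory code): derive kto_chosen_weight and
# kto_rejected_weight from the counts of desirable and undesirable examples
# so that both classes contribute comparably to the loss.

def suggest_kto_weights(n_desirable: int, n_undesirable: int,
                        target_ratio: float = 1.0):
    """Return (kto_chosen_weight, kto_rejected_weight) balancing the two classes.

    target_ratio is the desired value of
    chosen_weight * n_desirable / (rejected_weight * n_undesirable);
    the KTO paper suggests keeping it between 1 and 4/3.
    """
    if n_desirable == 0 or n_undesirable == 0:
        raise ValueError("KTO needs both desirable and undesirable examples.")
    # Fix the rejected weight at 1.0 and scale the chosen weight accordingly.
    kto_rejected_weight = 1.0
    kto_chosen_weight = target_ratio * n_undesirable / n_desirable
    return kto_chosen_weight, kto_rejected_weight


# Example: 8000 desirable vs. 2000 undesirable samples
# -> (0.25, 1.0), i.e. both classes contribute equally overall.
print(suggest_kto_weights(8000, 2000))
```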