Opened this issue 5 months ago · 0 comments
TIM/sft_reward_training/trainer/utils/model/reward_model.py, line 167 in 5876473