How to generate Critic Scores that can mimic a reward model
Opened this issue · 2 comments
AhmedKhaled945 commented
Hello, hope all is well,
I wanted to ask how to generate critic scores for a solution to a coding problem. Is there a way to get a score from the critic model instead of just classifying solutions with it?
YTJY commented
Hello, I don't know if you solved this problem, but I'm running into it now as well. Could you give me some advice?
YTJY commented
When I used the critic model to score the generated code, the results were very poor. I don't know if I made a mistake. Have you ever run into this?
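
For anyone landing here with the same question, one common way to turn a classifier into a scalar score is to take the softmax over its output logits and compute an expected value under per-class values. The sketch below is only an illustration, not this repository's method. It assumes the critic emits one logit per outcome class, and the names `critic_score` and `class_values` are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def critic_score(logits, class_values):
    """Expected value of the per-class values under the critic's distribution.

    logits:       one raw critic output per outcome class (assumed layout).
    class_values: hypothetical scalar value assigned to each class,
                  chosen by the user; not taken from any library.
    """
    if len(logits) != len(class_values):
        raise ValueError("logits and class_values must have the same length")
    probs = softmax(logits)
    return sum(p * v for p, v in zip(probs, class_values))


# Two equally likely classes valued 0.0 and 1.0 give a score of 0.5.
print(critic_score([0.0, 0.0], [0.0, 1.0]))
```

If only the probability of the "passing" class matters, setting that class's value to 1.0 and all others to 0.0 reduces the score to that single probability.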