salesforce/CodeRL

How to generate Critic Scores that can mimic a reward model

Opened this issue · 2 comments

Hello, hope all is well,

I wanted to ask how to generate critic scores for a solution to a coding problem. Is there a way to get a continuous score from the critic model instead of just a classification?

YTJY commented

Hello, I don't know if you solved this problem. I'm running into it as well. Could you give me some advice?

YTJY commented

When I used the critic model to score the generated code, the results were very poor. I don't know if I made a mistake. Have you ever encountered this?
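
For reference, one common way to turn a classifier's output into a scalar score is to apply a softmax to its logits and then read off either the probability of the "passed" class or a probability-weighted reward. The sketch below is not taken from the CodeRL repository. It assumes the critic returns one logit per outcome class, and the label order in `LABELS`, the `pass_index` default, and the reward values are all placeholders that should be checked against the checkpoint actually in use.

```python
import math

# Assumed label order (hypothetical): verify against your critic checkpoint.
LABELS = ["CompileError", "RuntimeError", "FailedTest", "PassedTest"]


def softmax(logits):
    """Convert raw logits into probabilities in a numerically stable way."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def critic_score(logits, pass_index=3):
    """Score in [0, 1]: the predicted probability of the 'passed' class."""
    return softmax(logits)[pass_index]


def expected_reward(logits, rewards=(-1.0, -0.6, -0.3, 1.0)):
    """Probability-weighted reward; the reward values are placeholders."""
    probs = softmax(logits)
    return sum(p * r for p, r in zip(probs, rewards))


if __name__ == "__main__":
    example_logits = [0.1, -0.4, 0.3, 1.2]  # made-up logits for illustration
    print(critic_score(example_logits))
    print(expected_reward(example_logits))
```

If scores computed this way look poor, it may be worth checking that the label order matches the one used when the critic was trained, since a mismatched `pass_index` would silently score the wrong class.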