Questions about RFT Inference

waterhorse1 opened this issue a year ago · 5 comments

Thanks for this great work. I have two questions. First, the generation code for the 7B/13B models seems to be missing. Second, about the hyperparameter settings: the defaults in single_inference_30b.py do not seem suitable for generating diverse reasoning paths.

Thank you for your help!

GanjinZero commented a year ago

sampling

Answer 1 · 2023-08-08T00:53:41.000Z

For the 7B/13B generation code, check group_7b_13b.sh. As for the hyperparameters, we discuss this in the paper: with temp=0.7, the 33B model produces about 2 different reasoning paths per 100 samples; with temp=1.0, it produces about 4.
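For anyone who wants to check how many different paths their own samples contain, here is a minimal sketch. It assumes that two completions count as the same path when they contain the same ordered sequence of equations, which may not match the exact criterion used in the paper. The regex and the names `path_key` and `distinct_paths` are illustrative and are not taken from the repository.

```python
import re

# Assumption: an "equation" is a chain of numbers joined by + - * /
# followed by "= <number>", e.g. "3 * 4 = 12".
NUM = r"\d+(?:\.\d+)?"
EQUATION = re.compile(rf"{NUM}(?:\s*[-+*/]\s*{NUM})+\s*=\s*{NUM}")


def path_key(completion: str) -> tuple:
    """Return the ordered equations in a completion, with whitespace removed."""
    return tuple(re.sub(r"\s+", "", eq) for eq in EQUATION.findall(completion))


def distinct_paths(completions):
    """Keep the first completion seen for each distinct equation sequence."""
    seen = {}
    for text in completions:
        seen.setdefault(path_key(text), text)
    return list(seen.values())


# Hypothetical samples for one question, standing in for model output.
samples = [
    "Tom has 3 * 4 = 12 apples. He eats 2, so 12 - 2 = 10.",
    "Tom has 3*4=12 apples, then 12-2=10 remain.",
    "Each box loses 2: 4 - 2 = 2. Then 3 * 2 = 6.",
]
unique = distinct_paths(samples)
print(len(unique), "distinct paths out of", len(samples), "samples")
```

The first two samples differ only in wording and spacing, so they collapse into one path. Running this over the 100 samples per question at each temperature gives a rough count to compare against the numbers above.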

Answer 2 · 2023-08-08T00:54:31.000Z

I will upload gen_train.sh later.

Answer 3 · 2023-08-08T01:53:45.000Z

@GanjinZero What kind of decoding strategy are you using, direct sampling or beam search?

Answer 4 · 2023-08-08T02:32:52.000Z

Thanks for your answer! I will close the issue.