Question on "Revisit Multi-Choice Question Benchmarks"
imhmhm opened this issue · 1 comment
imhmhm commented
Thanks so much for sharing the findings and insights about "Multi-Choice Question Benchmarks". I have a quick question about the 20 million Chinese MC data that led to overfitting without generalizing to other tasks: is the data composed of questions with pure options, or do the answers include some kind of explanation?
Thank you again for your great work!
DeepSeekPH commented
Pure options