deepseek-ai/DeepSeek-LLM

question on "Revisit Multi-Choice Question Benchmarks"

imhmhm opened this issue · 1 comment

imhmhm commented

Thanks so much for sharing the findings and insights about "Multi-Choice Question Benchmarks". I have a quick question about the 20 million Chinese multiple-choice (MC) questions that led to overfitting without generalizing to other tasks: was the data composed of questions with pure options only, or did the answers also include some kind of explanation?

Thank you again for your great work!

Pure options, without explanations in the answers.