model evaluation
mactavish91 opened this issue · 2 comments
mactavish91 commented
Thank you for your great evaluation. We have recently adopted a training strategy similar to LLaVA's, which co-trains VQA and chat data and yields significant improvements. Could you re-evaluate our model? https://github.com/THUDM/CogVLM/
xiangyue9607 commented
Thanks! We are happy to re-evaluate your model. That said, it would be great if you could run the evaluation yourself and provide your validation and test set predictions. Then we can update the leaderboard with a fair and accurate score.
Thanks,
Xiang
xiangyue9607 commented
You can submit your test set predictions on EvalAI: https://eval.ai/web/challenges/challenge-page/2179. We will update your results in our paper based on your submission.