A mistral long context - MegaBeam-Mistral-512K

Question

chenwuperth opened this issue 3 months ago · 2 comments

Hi, thanks for the project! Could you please evaluate https://huggingface.co/aws-prototyping/MegaBeam-Mistral-7B-512k on the latest RULER benchmark? Thanks!

Answer 1 · 2024-08-08T17:29:13.000Z

Sure! I put the results on the leaderboard (under our evaluation), although I saw that you have already tested it on your own. This is a pretty good long-context model. It would be great to have numbers showing its short-context performance (MMLU, MT-Bench, or something from the Open LLM Leaderboard).

Answer 2 · 2024-08-09T03:01:20.000Z

Thank you for testing it! Yes, I just wanted to confirm that our eval is consistent with yours (which appears to be the case). I will take a look at the short-context benchmarks, although we focused solely on "long" context when training this model.