hsiehjackson/RULER

Gemini 1.5 Flash results

augusto-rehfeldt opened this issue · 1 comment

Does anyone have RULER results for this model? It seems to hallucinate quite a lot on long-context prompts, even though it has a context window of one million tokens.

Thanks.