Evals that measure language models' ability to reason over long contexts.
Primary language: Python. License: MIT.