chandar-lab/EpiK-Eval
Benchmark to evaluate the capability of language models to consolidate and recall information from multiple training documents.
Python · MIT
No issues in this repository yet.