Evaluate your instruction-tuned models on open benchmarks with a single command.
Primary language: Python