Pinned Repositories
inspect_ai
Inspect: A framework for large language model evaluations
inspect_evals
Collection of evals for the Inspect evaluation framework
tau-bench
Code and Data for Tau-Bench
evanmiller-anthropic's Repositories
evanmiller-anthropic/inspect_evals
Collection of evals for the Inspect evaluation framework
evanmiller-anthropic/inspect_ai
Inspect: A framework for large language model evaluations
evanmiller-anthropic/tau-bench
Code and Data for Tau-Bench