Aim of this repository is to create evals for multi-turn dialogues, using AISI's Inspect open source framework. If you know nothing about Inspect, I wrote a beginner's guide going through their hello world example. You can also just read their documentation.
multi_dialogues.ipynb
. Main file for this repository which contains example eval workflow.question.jsonl
. Example dialogue evals copied from MT Bench..gitignore
,LICENSE
,README.md
- standard files for a github repo.
Issues or pull requests are welcome. Also happy to hear from people who want to collaborate to flesh out the toy example to make this more reusable and flexible.