Log model responses directly to file and reuse them for debugging
bauersimon opened this issue · 1 comment
bauersimon commented
Goal: be able to reuse the exact responses from a previous run, 1:1, to debug the evaluation logic.
- Log model responses directly to files (either at the provider query-response level or at the generate-test level).
- Add a dummy model that reads these files and responds accordingly (essentially replaying the original model responses).
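
The two steps above could be sketched roughly as follows. This is only an illustration in Python, not the project's actual code: the names `RecordingModel`, `ReplayModel`, `respond`, and `round_trip_demo`, the one-JSON-file-per-prompt layout, and the use of a SHA-256 hash of the prompt as the file name are all assumptions made for the example.

```python
import hashlib
import json
import tempfile
from pathlib import Path
from typing import Callable


def _prompt_key(prompt: str) -> str:
    """Derive a stable file name from the prompt text (hypothetical scheme)."""
    return hashlib.sha256(prompt.encode("utf-8")).hexdigest()


class RecordingModel:
    """Wraps a real query function and logs every response to a file."""

    def __init__(self, query: Callable[[str], str], log_dir: Path) -> None:
        self.query = query
        self.log_dir = log_dir
        self.log_dir.mkdir(parents=True, exist_ok=True)

    def respond(self, prompt: str) -> str:
        response = self.query(prompt)
        # Store prompt and response together so the log is self-describing.
        record = {"prompt": prompt, "response": response}
        path = self.log_dir / f"{_prompt_key(prompt)}.json"
        path.write_text(json.dumps(record), encoding="utf-8")
        return response


class ReplayModel:
    """Dummy model that answers only from previously logged responses."""

    def __init__(self, log_dir: Path) -> None:
        self.log_dir = log_dir

    def respond(self, prompt: str) -> str:
        path = self.log_dir / f"{_prompt_key(prompt)}.json"
        if not path.exists():
            raise KeyError("no recorded response for this prompt")
        record = json.loads(path.read_text(encoding="utf-8"))
        return record["response"]


def round_trip_demo() -> str:
    """Record one response with a stand-in model, then replay it."""
    with tempfile.TemporaryDirectory() as tmp:
        log_dir = Path(tmp)
        # Stand-in for a real provider call.
        recorder = RecordingModel(lambda prompt: prompt.upper(), log_dir)
        recorder.respond("write a test for add")
        return ReplayModel(log_dir).respond("write a test for add")
```

Keying files by prompt keeps the replay independent of query order, but it also means that a repeated prompt overwrites its earlier response. If the same prompt can be sent several times per run, a sequence number per prompt would be needed on top of this sketch.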
ahumenberger commented
Duplicate of #204. Closing.