Use a JSON configuration file to set up an evaluation run

Question


ruiAzevedo19 opened this issue 5 months ago · 1 comment

Goal: Allow exporting and updating a JSON configuration file that is used to run an evaluation. We want to see automatically which models, repositories and tasks are new or gone. This also lets us commit the configuration used for the full evaluation of a given eval version into the repository.

We want to store:

- available models for providers
- selected models for providers
- available repositories (with their tasks)
- selected repositories (with their tasks)

We want to load (?):

- selected models for providers
- selected repositories
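To make the store/load split above more concrete, here is a rough sketch of what the file could look like as Go types. Everything in it is an assumption for discussion: the type names, the JSON keys, the helper functions `encode`/`decode`, and the placeholder provider, model, repository and task names are hypothetical and not taken from the code base.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Configuration is a hypothetical layout for the evaluation configuration file.
type Configuration struct {
	Models       Selection `json:"models"`
	Repositories Selection `json:"repositories"`
}

// Selection pairs everything that is available with the subset selected for a run.
// For models the map key is a provider ID and the values are model IDs.
// For repositories the map key is a repository path and the values are task names.
type Selection struct {
	Available map[string][]string `json:"available"`
	Selected  map[string][]string `json:"selected"`
}

// encode renders the configuration as indented JSON.
// encoding/json writes map keys in sorted order, which keeps diffs stable.
func encode(c Configuration) ([]byte, error) {
	return json.MarshalIndent(c, "", "  ")
}

// decode parses a configuration previously written by encode.
func decode(data []byte) (Configuration, error) {
	var c Configuration
	err := json.Unmarshal(data, &c)
	return c, err
}

func main() {
	// All names below are placeholders.
	c := Configuration{
		Models: Selection{
			Available: map[string][]string{"example-provider": {"example/model-a", "example/model-b"}},
			Selected:  map[string][]string{"example-provider": {"example/model-a"}},
		},
		Repositories: Selection{
			Available: map[string][]string{"example/repository": {"example-task"}},
			Selected:  map[string][]string{"example/repository": {"example-task"}},
		},
	}

	data, err := encode(c)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(data))
}
```

Keeping "available" and "selected" side by side in one file means a single commit records both what existed at evaluation time and what was actually run.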

TODO

Iteration 1

Return all available models by querying the provider's APIs
- Provider.Models()
- Providers
  - ~~OpenRouter: https://openrouter.ai/api/v1/models~~ already implemented
  - ~~Ollama: http://127.0.0.1:11434/api/tags~~ already implemented
- The Ollama endpoint only lists the models that are installed locally and says nothing about which models are generally available, so ignore Ollama for now (until #283 is in).

Iteration 2

Store the available models in a JSON file
Store the selected models in a JSON file
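Once the available models are stored, seeing which ones are new or gone boils down to a set difference between the stored list and the freshly queried one. A small sketch of that idea follows; the function names `diff` and `toSet` and the model IDs are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// toSet converts a list of IDs into a set, dropping duplicates.
func toSet(ids []string) map[string]bool {
	set := make(map[string]bool, len(ids))
	for _, id := range ids {
		set[id] = true
	}
	return set
}

// diff compares the stored IDs with the current ones.
// added holds IDs that are only in current, removed holds IDs that are only in stored.
func diff(stored, current []string) (added, removed []string) {
	storedSet := toSet(stored)
	currentSet := toSet(current)

	for id := range currentSet {
		if !storedSet[id] {
			added = append(added, id)
		}
	}
	for id := range storedSet {
		if !currentSet[id] {
			removed = append(removed, id)
		}
	}
	// Map iteration order is random, so sort for deterministic output.
	sort.Strings(added)
	sort.Strings(removed)

	return added, removed
}

func main() {
	stored := []string{"example/model-a", "example/model-b"}
	current := []string{"example/model-b", "example/model-c"}

	added, removed := diff(stored, current)
	fmt.Println("new:", added)
	fmt.Println("gone:", removed)
}
```

The same helper would work unchanged for repositories and tasks, since it only compares string IDs.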

Iteration 3

Store the available repositories (with tasks) in a JSON file
Store the selected repositories (with tasks) in a JSON file

Iteration 4

Accept the JSON file as a configuration argument to the evaluation
- load selected models
- load selected repositories
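A possible shape for the loading side, as a sketch only: the flag name `--configuration`, the `Configuration` type, its JSON keys and the function `parseConfiguration` are all hypothetical. Only the selected entries are read; by default `encoding/json` ignores keys that have no matching field, so an "available" section in the same file would not get in the way.

```go
package main

import (
	"encoding/json"
	"flag"
	"fmt"
	"os"
)

// Configuration holds only the parts of the file that are loaded for a run (hypothetical layout).
type Configuration struct {
	Models struct {
		// Selected maps a provider ID to the model IDs to evaluate.
		Selected map[string][]string `json:"selected"`
	} `json:"models"`
	Repositories struct {
		// Selected maps a repository path to the tasks to run on it.
		Selected map[string][]string `json:"selected"`
	} `json:"repositories"`
}

// parseConfiguration decodes the JSON content of a configuration file.
func parseConfiguration(data []byte) (*Configuration, error) {
	var c Configuration
	if err := json.Unmarshal(data, &c); err != nil {
		return nil, fmt.Errorf("parsing configuration: %w", err)
	}
	return &c, nil
}

func main() {
	// Hypothetical flag name, not an existing option of the evaluation command.
	path := flag.String("configuration", "", "path to a JSON configuration file")
	flag.Parse()

	if *path == "" {
		fmt.Println("no configuration file given")
		return
	}

	data, err := os.ReadFile(*path)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	c, err := parseConfiguration(data)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}

	fmt.Println("selected models:", c.Models.Selected)
	fmt.Println("selected repositories:", c.Repositories.Selected)
}
```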

Iteration 5

~~Also store and load custom provider URLs so that they don't need to be carried over manually~~ Follow-up: #307

Answer 1 · 2024-07-24T06:33:06.000Z

@ahumenberger plz check if this makes sense to u 🙏