symflower/eval-dev-quality

Include metadata about models to allow comparing them

Tasks:

  • costs in dollars per request / response / prompt (per 1000 tokens) / completion (per 1000 tokens)
  • open-weight bool
  • license string
  • commercial-use-allowed bool
  • vendor (i.e. company) string, so that models can be color-coded in graphs
  • based-on string, e.g. https://huggingface.co/teknium/OpenHermes-2-Mistral-7B is fine-tuned from mistralai/Mistral-7B-v0.1. This helps to identify whether a model is better or worse than its original (noteworthy)
  • context length (in tokens)
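
To make the fields above concrete, here is a rough sketch of what such a per-model record could look like. It is not the project's actual data model: `ModelMetadata`, `estimate_cost`, and all field names are hypothetical, the pricing and license values are placeholders rather than real data, and reading "Context" as a context length in tokens is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelMetadata:
    """Hypothetical record holding the per-model fields listed in the tasks."""

    name: str
    vendor: str                           # company, e.g. for color coding in graphs
    open_weight: bool
    license: str
    commercial_use_allowed: bool
    cost_per_request: float               # USD
    cost_per_response: float              # USD
    cost_per_1k_prompt_tokens: float      # USD per 1000 prompt tokens
    cost_per_1k_completion_tokens: float  # USD per 1000 completion tokens
    based_on: Optional[str] = None        # parent model if fine-tuned, else None
    context_length: Optional[int] = None  # assumption: "Context" means tokens


def estimate_cost(meta: ModelMetadata, prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the USD cost of one request/response pair from the pricing fields."""
    return (
        meta.cost_per_request
        + meta.cost_per_response
        + meta.cost_per_1k_prompt_tokens * prompt_tokens / 1000
        + meta.cost_per_1k_completion_tokens * completion_tokens / 1000
    )


# Names are taken from the example in the task list; license and prices are
# placeholders, not real data.
example = ModelMetadata(
    name="teknium/OpenHermes-2-Mistral-7B",
    vendor="teknium",
    open_weight=True,
    license="placeholder-license",
    commercial_use_allowed=False,
    cost_per_request=0.0,
    cost_per_response=0.0,
    cost_per_1k_prompt_tokens=0.5,
    cost_per_1k_completion_tokens=1.5,
    based_on="mistralai/Mistral-7B-v0.1",
)
```

With a record like this, `example.based_on` could be used to pair a fine-tuned model with its original in a report, and `estimate_cost(example, 2000, 1000)` would combine the four pricing fields into a single number per evaluation request.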