EinStack/glide

✨ 🔧 [Telemetry] Add metrics to measure the health, latency, and request rate of model providers

roma-glushko opened this issue · 1 comment

Measure the following metrics for every configured model provider:

  • the number of successful requests (counter)
  • the number of failed requests (counter)
  • the response latency (non-streaming lang chat requests)
  • request rate to each provider
  • the first chunk latency (streaming lang chat requests)
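
To make the list concrete, here is a minimal in-memory sketch of what per-provider recording could look like. It uses only the Go standard library. Every name in it (`Registry`, `ProviderMetrics`, `CountRequest`, the bucket bounds) is hypothetical and not taken from Glide's code. A real implementation would more likely register counters and histograms with a metrics library instead of hand-rolling them.

```go
package main

import (
	"sync"
	"time"
)

// histogram counts latency observations into fixed buckets.
// The bounds below are arbitrary placeholders, not recommended values.
type histogram struct {
	bounds []time.Duration // upper bounds, ascending
	counts []uint64        // len(bounds)+1; the last slot is the overflow bucket
}

func newHistogram() histogram {
	bounds := []time.Duration{
		100 * time.Millisecond,
		500 * time.Millisecond,
		time.Second,
		5 * time.Second,
	}
	return histogram{bounds: bounds, counts: make([]uint64, len(bounds)+1)}
}

// observe increments the first bucket whose upper bound is >= d.
func (h *histogram) observe(d time.Duration) {
	i := 0
	for i < len(h.bounds) && d > h.bounds[i] {
		i++
	}
	h.counts[i]++
}

// ProviderMetrics (hypothetical) holds the measurements for one provider.
type ProviderMetrics struct {
	mu                sync.Mutex
	succeeded, failed uint64    // monotonically increasing counters
	responseLatency   histogram // non-streaming chat requests
	firstChunkLatency histogram // streaming chat requests
}

// CountRequest increments the success or failure counter.
func (m *ProviderMetrics) CountRequest(ok bool) {
	m.mu.Lock()
	defer m.mu.Unlock()
	if ok {
		m.succeeded++
	} else {
		m.failed++
	}
}

// ObserveResponseLatency records the full latency of a non-streaming request.
func (m *ProviderMetrics) ObserveResponseLatency(d time.Duration) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.responseLatency.observe(d)
}

// ObserveFirstChunkLatency records the time until the first streamed chunk.
func (m *ProviderMetrics) ObserveFirstChunkLatency(d time.Duration) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.firstChunkLatency.observe(d)
}

// TotalRequests is the sum of both counters.
func (m *ProviderMetrics) TotalRequests() uint64 {
	m.mu.Lock()
	defer m.mu.Unlock()
	return m.succeeded + m.failed
}

// Registry (hypothetical) keeps one ProviderMetrics per provider ID.
type Registry struct {
	mu        sync.Mutex
	providers map[string]*ProviderMetrics
}

func NewRegistry() *Registry {
	return &Registry{providers: make(map[string]*ProviderMetrics)}
}

// For returns the metrics for a provider, creating them on first use.
func (r *Registry) For(providerID string) *ProviderMetrics {
	r.mu.Lock()
	defer r.mu.Unlock()
	m, found := r.providers[providerID]
	if !found {
		m = &ProviderMetrics{
			responseLatency:   newHistogram(),
			firstChunkLatency: newHistogram(),
		}
		r.providers[providerID] = m
	}
	return m
}
```

A caller would wrap each provider request with something like `m := reg.For("openai")`, then `m.CountRequest(err == nil)` and `m.ObserveResponseLatency(time.Since(start))`. Keeping the two latency histograms separate reflects the split in the list between non-streaming and streaming chat requests.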

> request rate to each provider

Rate is a derived metric: the time series store computes it. A total_requests counter is enough, because the request rate can be computed from it.
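
As a rough illustration of that derivation, the sketch below computes a per-second rate from successive samples of a monotonically increasing counter. It is a simplification of what a time series store does, not any store's actual algorithm (PromQL's `rate()`, for instance, also extrapolates to the edges of the query window). The `Sample` type and `Rate` function are hypothetical names.

```go
package main

// Sample is one scraped value of a total_requests counter.
type Sample struct {
	UnixSeconds float64 // scrape timestamp
	Value       float64 // counter value at that time
}

// Rate returns the average per-second increase across the samples.
// A decrease between two samples is treated as a counter reset
// (for example, a process restart), so the new value counts as the increase.
func Rate(samples []Sample) float64 {
	if len(samples) < 2 {
		return 0
	}
	var increase float64
	for i := 1; i < len(samples); i++ {
		delta := samples[i].Value - samples[i-1].Value
		if delta < 0 {
			delta = samples[i].Value
		}
		increase += delta
	}
	elapsed := samples[len(samples)-1].UnixSeconds - samples[0].UnixSeconds
	if elapsed <= 0 {
		return 0
	}
	return increase / elapsed
}
```

Because the rate falls out of the counter samples alone, the application only needs to export the counter. The window length and aggregation across providers can then be chosen at query time.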