✨ 🔧 [Telemetry] Add metrics to measure health, latency, request rate to model providers
roma-glushko opened this issue · 1 comment
roma-glushko commented
Measure the following metrics for every configured model provider:
- the number of successful requests (counter)
- the number of failed requests (counter)
- the response latency (non-streaming lang chat requests)
- the request rate to each provider
- the first chunk latency (streaming lang chat requests)
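As a rough illustration of the list above, here is a minimal in-memory sketch of per-provider counters and latency samples. Every name in it (`ProviderMetrics`, `MetricsRegistry`, `record_chat`, `record_first_chunk`) is hypothetical and made up for this example; it is not the project's API, and a real implementation would more likely export these through a metrics library than keep raw lists in memory.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class ProviderMetrics:
    """Hypothetical per-provider metrics holder (illustration only)."""
    successful_requests: int = 0   # counter
    failed_requests: int = 0       # counter
    response_latencies: list = field(default_factory=list)     # seconds, non-streaming
    first_chunk_latencies: list = field(default_factory=list)  # seconds, streaming


class MetricsRegistry:
    """Keeps one ProviderMetrics per provider name."""

    def __init__(self):
        self._by_provider = defaultdict(ProviderMetrics)

    def record_chat(self, provider, ok, latency_s):
        """Record the outcome of a non-streaming chat request."""
        m = self._by_provider[provider]
        if ok:
            m.successful_requests += 1
            # Latency is only recorded for successful responses in this sketch.
            m.response_latencies.append(latency_s)
        else:
            m.failed_requests += 1

    def record_first_chunk(self, provider, latency_s):
        """Record the time until the first chunk of a streaming chat request."""
        self._by_provider[provider].first_chunk_latencies.append(latency_s)

    def get(self, provider):
        return self._by_provider[provider]


registry = MetricsRegistry()
registry.record_chat("openai", ok=True, latency_s=0.42)
registry.record_chat("openai", ok=False, latency_s=1.50)
registry.record_first_chunk("openai", latency_s=0.08)
```

The provider name `"openai"` and the latency values are placeholders. Whether failed requests should also contribute latency samples is a design choice left open here.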
gernest commented
> request rate to each provider
Rate is a derived metric: it is computed by the time series store. A `total_requests` counter is enough, because the request rate can be computed from it.
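To make that concrete, here is a simplified sketch of how a rate can be derived from periodic samples of a monotonically increasing counter. The function name `rate_per_second` and the sample format are assumptions for illustration; real time series stores apply their own rules (for example around extrapolation at window edges), so this is not any particular store's algorithm.

```python
def rate_per_second(samples):
    """Average per-second increase of a counter over a window.

    `samples` is a time-ordered list of (timestamp_seconds, counter_value).
    A drop in value is treated as a counter reset (e.g. a process restart),
    and the post-reset value is counted as the increase for that step.
    """
    if len(samples) < 2:
        return 0.0

    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0:
        return 0.0

    increase = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        if cur >= prev:
            increase += cur - prev
        else:
            # Counter reset: assume it restarted from zero.
            increase += cur

    return increase / elapsed


# total_requests scraped every 30 seconds (made-up values)
scrapes = [(0, 100), (30, 160), (60, 220)]
requests_per_second = rate_per_second(scrapes)
```

With these made-up samples the counter grows by 120 over 60 seconds, so only the counter needs to be exported by the service.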