NVIDIA/tensorrt-laboratory

Improve Metrics and Dashboard

ryanolson opened this issue · 0 comments

Possible Metrics and Status

  • batches / second (counter/rate)
  • inferences / second (counter/rate)
  • gpu power (gauge)
  • queue depth (gauge)
  • request time (summary quantile 50/90/99)
  • compute time (summary quantile 50/90/99)
  • load_ratio [request time / compute time] (histogram: buckets [2, 4, 10, 100, 1000])
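As a rough illustration of the load_ratio histogram proposed above, here is a minimal stdlib-only sketch of how observations would fall into cumulative "less than or equal" buckets, which is how Prometheus stores histograms. The helper names `load_ratio` and `cumulative_counts` are hypothetical and not part of this repository; only the bucket bounds come from the list.

```python
import bisect

# Upper bounds from the proposal above; a trailing +Inf bucket is implicit,
# as it is in Prometheus histograms.
LOAD_RATIO_BUCKETS = [2, 4, 10, 100, 1000]


def load_ratio(request_time, compute_time):
    """Ratio of total request time to compute time (hypothetical helper)."""
    if compute_time <= 0:
        raise ValueError("compute_time must be positive")
    return request_time / compute_time


def cumulative_counts(ratios, bounds=LOAD_RATIO_BUCKETS):
    """Return cumulative counts per `le` bucket, with a final +Inf entry."""
    counts = [0] * (len(bounds) + 1)
    for r in ratios:
        # Index of the first bucket whose upper bound is >= r.
        first = bisect.bisect_left(bounds, r)
        # Cumulative semantics: every bucket from `first` upward sees it.
        for i in range(first, len(counts)):
            counts[i] += 1
    return counts
```

A ratio close to 1 would mean a request spends almost all of its time in compute; larger ratios would point to queueing or other overhead, which is presumably why the proposed buckets are spaced so widely.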

The Grafana panels need serious work. Does anyone have a good way to visualize Prometheus histograms with Grafana?
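One common approach, offered here as a suggestion rather than a settled answer, is to plot quantiles derived from the bucket counters with a PromQL query such as `histogram_quantile(0.9, sum by (le) (rate(load_ratio_bucket[5m])))`, where `load_ratio_bucket` is an assumed metric name. Grafana's heatmap panel is another option for showing the buckets directly. The sketch below approximates the linear interpolation that this kind of quantile estimate performs over cumulative buckets; `estimate_quantile` is a hypothetical helper, not the Prometheus implementation, and it mainly shows why the result is only as precise as the bucket boundaries.

```python
import math


def estimate_quantile(q, bounds, cumulative):
    """Estimate quantile q from cumulative histogram bucket counts.

    bounds: finite bucket upper bounds in ascending order.
    cumulative: cumulative counts, one per bound plus a final +Inf entry.
    """
    if not 0 <= q <= 1:
        raise ValueError("q must be between 0 and 1")
    total = cumulative[-1]
    if total == 0:
        return math.nan
    rank = q * total
    # Find the first bucket whose cumulative count reaches the rank.
    idx = next(i for i, c in enumerate(cumulative) if c >= rank)
    if idx == len(bounds):
        # Rank falls in the +Inf bucket: report the highest finite bound.
        return float(bounds[-1])
    lower = float(bounds[idx - 1]) if idx > 0 else 0.0
    below = cumulative[idx - 1] if idx > 0 else 0
    in_bucket = cumulative[idx] - below
    if in_bucket == 0:
        return lower
    # Assume observations are spread evenly inside the bucket.
    return lower + (bounds[idx] - lower) * (rank - below) / in_bucket
```

With buckets as wide as [100, 1000], an estimate that lands in that range can be far from the true value, so finer buckets near the ratios of interest may be worth considering before building panels on top of them.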