bringg/spark-nanny

[Feature request]: add Prometheus metrics

Closed this issue · 0 comments

It would be nice to have Prometheus metrics, so we can monitor, graph, and alert on what spark-nanny does.

Suggested metrics:

  • total time taken to poke the drivers of all Spark applications
  • pod kill counter per Spark application
  • nice to have: spark-nanny version information metric
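
To make the request more concrete, here is a rough sketch of how the three metrics above might look in the Prometheus text exposition format. It is plain standard-library Python, not the official Prometheus client and not spark-nanny code. The metric names, the `app` label, and the `NannyMetrics` class are all hypothetical, and modelling the poke duration as a gauge for the last cycle is just one option (a histogram would also work).

```python
from collections import defaultdict


def escape_label(value):
    """Escape a label value for the Prometheus text exposition format."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


class NannyMetrics:
    """In-memory holder for the three suggested metrics (hypothetical names)."""

    def __init__(self, version):
        self.version = version
        # Gauge: duration of the most recent poke cycle over all applications.
        self.poke_duration_seconds = 0.0
        # Counter: number of pods killed, keyed by Spark application name.
        self.pod_kills = defaultdict(int)

    def observe_poke_cycle(self, seconds):
        self.poke_duration_seconds = seconds

    def record_pod_kill(self, app):
        self.pod_kills[app] += 1

    def render(self):
        """Return the metrics as text that a /metrics endpoint could serve."""
        lines = [
            "# HELP spark_nanny_poke_duration_seconds Time taken to poke the drivers of all Spark applications.",
            "# TYPE spark_nanny_poke_duration_seconds gauge",
            f"spark_nanny_poke_duration_seconds {self.poke_duration_seconds}",
            "# HELP spark_nanny_pod_kills_total Pods killed, per Spark application.",
            "# TYPE spark_nanny_pod_kills_total counter",
        ]
        for app in sorted(self.pod_kills):
            count = self.pod_kills[app]
            lines.append(f'spark_nanny_pod_kills_total{{app="{escape_label(app)}"}} {count}')
        lines += [
            "# HELP spark_nanny_build_info spark-nanny version information.",
            "# TYPE spark_nanny_build_info gauge",
            f'spark_nanny_build_info{{version="{escape_label(self.version)}"}} 1',
        ]
        return "\n".join(lines) + "\n"


if __name__ == "__main__":
    metrics = NannyMetrics(version="0.0.0")  # placeholder version string
    metrics.observe_poke_cycle(0.25)
    metrics.record_pod_kill("example-app")   # placeholder application name
    print(metrics.render(), end="")
```

With a per-application counter like this, alerting on repeated kills for a single application becomes a simple rate query, and the version metric follows the common "info gauge fixed at 1 with labels" pattern.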