Add Prometheus infrastructure with basic metrics
MarkusH opened this issue · 0 comments
MarkusH commented
Prometheus is the de facto standard for monitoring in Kubernetes. The CrateDB Kubernetes Operator should collect its own metrics and expose them for Prometheus.
To begin with, it would be nice to track the following metrics:
- The total number of clusters deployed
- The total number of clusters deleted
- The number of clusters monitored by the operator
- The total number of times clusters were restarted
- The total number of times clusters were scaled
- The total number of times clusters were upgraded
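
As a rough sketch of what the metrics listed above could look like with the `prometheus_client` Python library: counters for the monotonically increasing totals and a gauge for the number of clusters currently monitored. The metric names, the `serve_metrics` helper, and the port are assumptions made for illustration, not names used by the operator.

```python
from prometheus_client import Counter, Gauge, start_http_server

# All metric names below are hypothetical; pick whatever naming scheme fits.
CLUSTERS_DEPLOYED = Counter(
    "crate_operator_clusters_deployed_total", "Total number of clusters deployed"
)
CLUSTERS_DELETED = Counter(
    "crate_operator_clusters_deleted_total", "Total number of clusters deleted"
)
CLUSTER_RESTARTS = Counter(
    "crate_operator_cluster_restarts_total", "Total number of cluster restarts"
)
CLUSTER_SCALINGS = Counter(
    "crate_operator_cluster_scalings_total", "Total number of cluster scalings"
)
CLUSTER_UPGRADES = Counter(
    "crate_operator_cluster_upgrades_total", "Total number of cluster upgrades"
)
# A gauge, because this value can go down as well as up.
CLUSTERS_MONITORED = Gauge(
    "crate_operator_clusters_monitored", "Clusters currently monitored"
)


def serve_metrics(port: int = 8000) -> None:
    """Expose the default registry over HTTP for Prometheus to scrape."""
    start_http_server(port)


if __name__ == "__main__":
    serve_metrics()
    CLUSTERS_DEPLOYED.inc()  # e.g. after a cluster was created successfully
    CLUSTERS_MONITORED.inc()
```

Counters only ever increase, which matches the "total number of" metrics. The monitored-cluster count is a gauge so it can be decremented when a cluster is deleted.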
Care must be taken with Kopf's idempotence: the `kopf.on.*` handlers may fail and will be retried until they succeed, unless they fail permanently. The metrics above should only be updated when a handler succeeds, although there is a case to be made for tracking failures as well.
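
One possible way to count only successful runs is to wrap each handler in a decorator that records the outcome after the handler returns. The sketch below is framework-agnostic and uses a plain `collections.Counter` as a stand-in for real Prometheus counters. The names `track_outcome`, `EVENTS`, and `scale_cluster` are hypothetical, and it assumes synchronous handlers. In the operator, such a decorator would presumably sit beneath the `kopf.on.*` decorator.

```python
import functools
from collections import Counter

# Stand-in for Prometheus counters, keyed by (operation, outcome).
EVENTS = Counter()


def track_outcome(operation):
    """Count a handler run as a success only if it returns normally."""

    def decorator(handler):
        @functools.wraps(handler)
        def wrapper(*args, **kwargs):
            try:
                result = handler(*args, **kwargs)
            except Exception:
                # Record the failed attempt, then re-raise so the caller
                # (the framework) can decide whether to retry.
                EVENTS[(operation, "failure")] += 1
                raise
            EVENTS[(operation, "success")] += 1
            return result

        return wrapper

    return decorator


attempts = {"count": 0}


@track_outcome("scale")
def scale_cluster(**kwargs):
    """Hypothetical handler that fails twice before succeeding."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("transient error")
    return "scaled"


def run_with_retries(handler, max_attempts=5):
    """Simulate a framework that retries a handler until it succeeds."""
    for _ in range(max_attempts):
        try:
            return handler()
        except RuntimeError:
            continue
    raise RuntimeError("handler never succeeded")
```

With this arrangement, retried attempts show up only in the failure count, and the success count increases once per completed operation.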