The Cluster Monitoring Operator manages and updates the Prometheus-based monitoring stack deployed on top of OpenShift.
It contains the following components:
- Prometheus Operator
- Prometheus
- Alertmanager cluster for cluster and application level alerting
- kube-state-metrics
- node_exporter
Users can leverage the deployed Prometheus Operator to easily deploy new Prometheus setups for their own application monitoring.
The Prometheus instance (prometheus-k8s) is responsible for monitoring and alerting on cluster and OpenShift components. It should not be extended to monitor user applications.
Alertmanager is a cluster-global component for handling alerts generated by all Prometheus instances deployed in that cluster.
Metrics are collected from the following components:
- kube-state-metrics
- node_exporter
- Kubelets
- API server
- Prometheus (just prometheus-k8s for now)
- Alertmanager
Important: The Prometheus Operator managed by the Cluster Monitoring Operator will by default only look for ServiceMonitor resources in namespaces containing an openshift.io/cluster-monitoring label (with any value).
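The selection rule above can be illustrated with a minimal sketch. It assumes a simplified, hypothetical Namespace type rather than the real Kubernetes API objects, and the namespace names are made up for illustration; only the label key comes from the text. The point it shows is that the presence of the label key matters, not its value.

```go
package main

import (
	"fmt"
	"sort"
)

// clusterMonitoringLabel is the label key named in the text above.
const clusterMonitoringLabel = "openshift.io/cluster-monitoring"

// Namespace is a hypothetical, simplified stand-in for a Kubernetes namespace.
type Namespace struct {
	Name   string
	Labels map[string]string
}

// monitoredNamespaces returns the sorted names of all namespaces that carry
// the cluster monitoring label key, regardless of the label's value.
func monitoredNamespaces(namespaces []Namespace) []string {
	var names []string
	for _, ns := range namespaces {
		// Only the presence of the key is checked; the value is ignored.
		if _, ok := ns.Labels[clusterMonitoringLabel]; ok {
			names = append(names, ns.Name)
		}
	}
	sort.Strings(names)
	return names
}

func main() {
	// Illustrative namespaces; the names and label values are assumptions.
	namespaces := []Namespace{
		{Name: "openshift-monitoring", Labels: map[string]string{clusterMonitoringLabel: "true"}},
		{Name: "my-app", Labels: map[string]string{"team": "payments"}},
		{Name: "openshift-example", Labels: map[string]string{clusterMonitoringLabel: ""}},
	}
	fmt.Println(monitoredNamespaces(namespaces))
}
```

In this sketch, a ServiceMonitor placed in my-app would not be picked up, because that namespace lacks the label key.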
The Cluster Monitoring Operator has many builtin ServiceMonitor resources which enable discovering the metrics endpoints of a variety of well-known components. Only components that must be created before the cluster monitoring stack belong in this repository, in order to solve the cyclic dependencies of bootstrapping.
To register a new builtin component, make the following changes:
- Add a new ServiceMonitor manifest file to jsonnet/prometheus.jsonnet. An example of this can be seen for the OpenShift component "kube-controllers".
- Re-generate the go-bindata code, using the pkg/manifests/bindata.go make target. This will also create a new file in assets/prometheus-k8s/ according to the name given in the jsonnet code.
- Add a constant in pkg/manifests/manifests.go which points to the new manifest file, from assets/.
- Add a new Factory method in pkg/manifests/manifests.go which loads the manifest using the new constant.
- Add a step to PrometheusTask in pkg/tasks/prometheus.go which creates the ServiceMonitor using the new Factory method.
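The last three steps can be sketched roughly as follows. This is a self-contained illustration, not the repository's actual code: the asset file name, the AssetFunc type, the mapAssets helper, the method name ExampleServiceMonitorManifest, and the create callback are all hypothetical, and the real Factory and PrometheusTask types have different fields and signatures.

```go
package main

import (
	"errors"
	"fmt"
)

// exampleServiceMonitor is a hypothetical constant pointing at the generated
// manifest under assets/ (the file name is made up for this sketch).
const exampleServiceMonitor = "assets/prometheus-k8s/service-monitor-example.yaml"

// AssetFunc is a hypothetical stand-in for the generated go-bindata lookup.
type AssetFunc func(name string) ([]byte, error)

// Factory is a simplified stand-in for the manifest factory.
type Factory struct {
	asset AssetFunc
}

// ExampleServiceMonitorManifest loads the raw manifest using the new constant.
func (f *Factory) ExampleServiceMonitorManifest() ([]byte, error) {
	return f.asset(exampleServiceMonitor)
}

// PrometheusTask is a simplified stand-in for the task that reconciles resources.
type PrometheusTask struct {
	factory *Factory
	create  func(manifest []byte) error // hypothetical "create or update" hook
}

// Run contains the added step: load the manifest, then create the resource.
func (t *PrometheusTask) Run() error {
	manifest, err := t.factory.ExampleServiceMonitorManifest()
	if err != nil {
		return fmt.Errorf("loading example ServiceMonitor failed: %w", err)
	}
	if err := t.create(manifest); err != nil {
		return fmt.Errorf("creating example ServiceMonitor failed: %w", err)
	}
	return nil
}

// mapAssets builds an in-memory AssetFunc so the sketch runs without bindata.
func mapAssets(files map[string][]byte) AssetFunc {
	return func(name string) ([]byte, error) {
		data, ok := files[name]
		if !ok {
			return nil, errors.New("asset not found: " + name)
		}
		return data, nil
	}
}

func main() {
	factory := &Factory{asset: mapAssets(map[string][]byte{
		exampleServiceMonitor: []byte("kind: ServiceMonitor\n"),
	})}
	task := &PrometheusTask{
		factory: factory,
		create: func(manifest []byte) error {
			fmt.Printf("would create ServiceMonitor from %d bytes\n", len(manifest))
			return nil
		},
	}
	if err := task.Run(); err != nil {
		fmt.Println("error:", err)
	}
}
```

Keeping the asset path in a single constant means the Factory method and any tests refer to the same file, so a renamed manifest fails in one obvious place.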
To add new builtin recording or alerting rules:
- Add a new Prometheus rules file to jsonnet/rules.jsonnet.
Run make pkg/manifests/bindata.go after you modify the files and make sure to add the modified files to the commit. All rules are automatically created, so no additional code changes are necessary.
- Monitor etcd
- Adapt Tectonic inherited alerts with OpenShift operational knowledge
Run e2e-tests with make e2e-test.
Clean up after e2e-tests with make e2e-clean.