The Cluster Monitoring Operator manages and updates the Prometheus-based cluster monitoring stack deployed on top of OpenShift.
It contains the following components:
- Prometheus Operator
- Prometheus
- Alertmanager cluster for cluster and application level alerting
- kube-state-metrics
- node_exporter
The deployed Prometheus Operator is intended to be used only for cluster-level monitoring.
As such, the deployed Prometheus instance (`prometheus-k8s`) is responsible for monitoring and alerting on cluster and OpenShift components; it should not be extended to monitor user applications.
Important: The Prometheus Operator managed by the Cluster Monitoring Operator will by default only look for `ServiceMonitor` resources in the `openshift-monitoring` namespace.
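The namespace restriction above can be pictured with a minimal Go sketch. It is not the operator's code: the `ServiceMonitor` struct and the `discoverable` helper are hypothetical stand-ins that only model the idea that resources outside `openshift-monitoring` are ignored by default.

```go
package main

import "fmt"

// ServiceMonitor is a simplified, hypothetical stand-in for the real
// custom resource; only the fields needed for the illustration are kept.
type ServiceMonitor struct {
	Name      string
	Namespace string
}

// monitoringNamespace is the namespace the cluster Prometheus Operator
// watches by default, as stated in the text above.
const monitoringNamespace = "openshift-monitoring"

// discoverable returns the ServiceMonitors located in the given namespace.
// This is an illustrative helper, not a function of the operator.
func discoverable(all []ServiceMonitor, namespace string) []ServiceMonitor {
	var out []ServiceMonitor
	for _, sm := range all {
		if sm.Namespace == namespace {
			out = append(out, sm)
		}
	}
	return out
}

func main() {
	all := []ServiceMonitor{
		{Name: "kubelet", Namespace: "openshift-monitoring"},
		{Name: "my-app", Namespace: "my-project"}, // ignored by default
	}
	for _, sm := range discoverable(all, monitoringNamespace) {
		fmt.Println(sm.Namespace + "/" + sm.Name)
	}
}
```

In this sketch, a `ServiceMonitor` created in a user project is simply never selected, which is why application monitoring needs a separate Prometheus Operator.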
Users interested in leveraging Prometheus for application monitoring on OpenShift should consider using OLM to deploy a Prometheus Operator and set up new Prometheus instances to monitor and alert on their applications.
Alertmanager is a cluster-global component for handling alerts generated by all Prometheus instances deployed in that cluster.
Metrics are collected from the following components:
- kube-state-metrics
- node_exporter
- Kubelets
- API server
- Prometheus (just `prometheus-k8s` for now)
- Alertmanager
- Telemeter
The Cluster Monitoring Operator has many builtin `ServiceMonitor` resources which enable discovering the metrics endpoints of a variety of well-known components. Only components that must be created before the cluster monitoring stack belong in this repository, in order to solve the cyclic dependencies of bootstrapping.
To register a new builtin component, make the following changes:
- Add a new `ServiceMonitor` manifest file to jsonnet/prometheus.jsonnet. An example of this can be seen for the OpenShift component "kube-controllers".
- Re-generate the go-bindata code using the `pkg/manifests/bindata.go` make target. This also creates a new file in `assets/prometheus-k8s/`, named according to the name given in the jsonnet code.
- Add a constant in pkg/manifests/manifests.go which points to the new manifest file under `assets/`.
- Add a new `Factory` method in pkg/manifests/manifests.go which loads the manifest using the new constant.
- Add a step to `PrometheusTask` in pkg/tasks/prometheus.go which creates the `ServiceMonitor` using the new `Factory` method.
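The last three steps (constant, `Factory` method, task step) can be sketched roughly as follows. This is a self-contained illustration under stated assumptions, not the repository's actual code: the `assets` map stands in for the generated bindata, and the manifest path, the `ExampleServiceMonitor` method, the `Client` interface, and the `recordingClient` type are all hypothetical names chosen for this example.

```go
package main

import "fmt"

// assets is a hypothetical stand-in for the generated go-bindata code:
// it maps a manifest path to the manifest's raw bytes.
var assets = map[string][]byte{
	"assets/prometheus-k8s/service-monitor-example.yaml": []byte(
		"kind: ServiceMonitor\nmetadata:\n  name: example\n"),
}

// Constant pointing at the new manifest file (hypothetical path).
const exampleServiceMonitorPath = "assets/prometheus-k8s/service-monitor-example.yaml"

// Factory loads manifests from the bundled assets.
type Factory struct{}

// ExampleServiceMonitor loads the new manifest using the new constant.
func (f *Factory) ExampleServiceMonitor() ([]byte, error) {
	b, ok := assets[exampleServiceMonitorPath]
	if !ok {
		return nil, fmt.Errorf("asset %q not found", exampleServiceMonitorPath)
	}
	return b, nil
}

// Client is a hypothetical abstraction over the cluster API.
type Client interface {
	CreateOrUpdate(manifest []byte) error
}

// recordingClient records manifests instead of talking to a cluster.
type recordingClient struct{ applied [][]byte }

func (c *recordingClient) CreateOrUpdate(m []byte) error {
	c.applied = append(c.applied, m)
	return nil
}

// PrometheusTask creates the resources of the Prometheus stack.
type PrometheusTask struct {
	factory *Factory
	client  Client
}

// Run contains the new step: load the ServiceMonitor, then create it.
func (t *PrometheusTask) Run() error {
	sm, err := t.factory.ExampleServiceMonitor()
	if err != nil {
		return fmt.Errorf("loading example ServiceMonitor failed: %w", err)
	}
	if err := t.client.CreateOrUpdate(sm); err != nil {
		return fmt.Errorf("creating example ServiceMonitor failed: %w", err)
	}
	return nil
}

func main() {
	c := &recordingClient{}
	task := &PrometheusTask{factory: &Factory{}, client: c}
	if err := task.Run(); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("applied %d manifest(s)\n", len(c.applied))
}
```

The real `Factory` and `PrometheusTask` in pkg/manifests/manifests.go and pkg/tasks/prometheus.go will differ in their types and signatures; the sketch only shows how the constant, the loader method, and the task step depend on one another.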
To add new builtin recording or alerting rules:
- Add a new Prometheus rules file to jsonnet/rules.jsonnet.
Run `make pkg/manifests/bindata.go` after you modify the files, and make sure to add the modified files to the commit. All rules are created automatically, so no additional code changes are necessary.
- Monitor etcd
- Adapt Tectonic inherited alerts with OpenShift operational knowledge
Run e2e-tests with `make e2e-test`.
Clean up after e2e-tests with `make e2e-clean`.