
Manage the OpenShift monitoring stack

Primary language: Go. License: Apache License 2.0.

Cluster Monitoring Operator

The Cluster Monitoring Operator manages and updates the Prometheus-based cluster monitoring stack deployed on top of OpenShift.

It contains the following components:

The deployed Prometheus Operator is intended only for cluster-level monitoring. The Prometheus instance it deploys (prometheus-k8s) monitors and alerts on cluster and OpenShift components; it should not be extended to monitor user applications.

Important: By default, the Prometheus Operator managed by the Cluster Monitoring Operator only looks for ServiceMonitor resources in the openshift-monitoring namespace.

Users who want to use Prometheus for application monitoring on OpenShift should consider using OLM to deploy a separate Prometheus Operator and set up new Prometheus instances that monitor and alert on their applications.

Alertmanager is a cluster-global component for handling alerts generated by all Prometheus instances deployed in that cluster.

Metrics are collected from the following components:

Adding new metrics to be sent via telemetry

To send a new metric via telemetry, add a selector that matches its time series to manifests/0000_50_cluster_monitoring_operator_04-config.yaml.
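As a rough illustration of what "a selector that matches the time series" means, the sketch below checks metric names against selectors written in the simplest Prometheus series-selector form, `{__name__="metric_name"}`. It assumes the entries in the config file use that form; the metric names and the helper functions `selectedName` and `matches` are hypothetical and are not part of the operator's code.

```go
package main

import (
	"fmt"
	"regexp"
)

// nameSelector recognises only the equality form {__name__="metric_name"}.
// Real Prometheus selectors may carry more label matchers and regular
// expressions; this sketch deliberately ignores those.
var nameSelector = regexp.MustCompile(`^\{__name__="([a-zA-Z_:][a-zA-Z0-9_:]*)"\}$`)

// selectedName extracts the metric name from a selector of that form.
// It returns false when the selector has any other shape.
func selectedName(selector string) (string, bool) {
	m := nameSelector.FindStringSubmatch(selector)
	if m == nil {
		return "", false
	}
	return m[1], true
}

// matches reports whether any selector in the list selects the metric name.
func matches(selectors []string, metric string) bool {
	for _, s := range selectors {
		if name, ok := selectedName(s); ok && name == metric {
			return true
		}
	}
	return false
}

func main() {
	// Hypothetical selector list; "my_new_metric" stands in for the metric
	// being added.
	selectors := []string{`{__name__="up"}`, `{__name__="my_new_metric"}`}

	fmt.Println(matches(selectors, "my_new_metric")) // prints true
	fmt.Println(matches(selectors, "other_metric"))  // prints false
}
```

A series whose name no selector matches would not be selected, which is why a new metric needs its own entry.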

Documentation on the data sent can be found in the data collection documentation.

Roadmap

  • Monitor etcd
  • Adapt Tectonic inherited alerts with OpenShift operational knowledge

Testing

  • Unit tests: make test-unit

  • End-to-end tests: make test-e2e

Contributing

Please refer to the CONTRIBUTING.md document for information.

Release

Before a new OpenShift release, pin the dependencies to their release branches:

  1. In kube-prometheus, cut a release.
  2. In this repo, set the "version" of every dependency in jsonnet/jsonnetfile.json to its release branch.