/openshift-prometheus

Repository for all things related to Prometheus on OpenShift

Primary LanguageShell

Prometheus on OpenShift

This repository contains definitions and tools to run Prometheus and its associated ecosystem on Red Hat OpenShift.

Components

The following components are available:

Project Organization

A new project called prometheus will be created to contain the entire ecosystem.

Execute the following command to create the project:

oc new-project prometheus --display-name="Prometheus Monitoring"

Make sure that there is not a default node selector on the project:

oc annotate namespace prometheus openshift.io/node-selector=""

Deploy Prometheus

Starting with OpenShift 3.6 the OpenShift routers expose a metrics endpoint on port 1936. For Prometheus to be able to monitor the routers this port needs to be open.

Additionally Prometheus does not work with remote volumes (NFS, EBS, …​) but needs local disk storage as well. This means we need to create a directory on (one of) the infranodes. The Prometheus template includes a Node Selector prometheus-host=true - so we need to set the correct label on the infranode(s) as well.

Run the following Ansible playbook to configure the infranodes:

ansible-playbook -i /etc/ansible/hosts ./setup_infranodes.yml

The router also requires basic authentication to be allowed to scrape the metrics. Find the router password by executing the following command:

oc set env dc router -n default --list|grep STATS_PASSWORD|awk -F"=" '{print $2}'

An OpenShift template has been provided to streamline the deployment to OpenShift.

Execute the following command to instantiate the Prometheus template using the previously retrieved router password as a parameter:

oc new-app -f prometheus.yaml --param ROUTER_PASSWORD=<Router Password>

Since Prometheus needs to use a local disk to write its metrics add the privileged SCC to the prometheus service account:

oc adm policy add-scc-to-user privileged system:serviceaccount:prometheus:prometheus

Make sure your Prometheus pod is running (on an Infranode):

oc get pod -o wide

Next Steps

Please refer to the following to enhance the functionality of Prometheus

Cleanup

Delete the project and the cluster-reader binding (which gets created by the template but doesn’t get deleted as part of the project):

oc delete project prometheus
oc delete clusterrolebinding prometheus-cluster-reader
oc adm policy remove-scc-from-user privileged prometheus

You will also need to clean up the directory /var/lib/prometheus-data on the Infranode(s) and remove the label prometheus-host=true from the Infranode(s).