Code examples for Master Class: Monitoring and alerting in Rancher 2.5

Installation steps

Fill out terraform-setup/terraform.tfvars with aws and digital ocean credentials.

Create infrastructure and install Rancher

make install

Configure Rancher

Go to https://rancher-demo.plgrnd.be/login and set up admin password and server url.

Install rancher-monitoring

In the local cluster, go to Cluster Explorer -> Apps & Marketplace.
Install the rancher-monitoring app with the default settings

! NOTE: On Rancher <2.5.2, install from dev-v2.5 branch, enable k3s in yaml, increase memory limit to 3500Mi

Explore rancher-monitoring

See the helm install output
Have a look at the installed workloads in cattle-monitoring-system.
Have a look at Prometheus
- Scraping targets
- Built-in alerting rules
Have a look at Grafana
- Built-in dashboards
- Logging into Grafana as admin

Deploy your own workload and scrape it

Deploy shop

kubectl -n default apply -f scrape-custom-service/01-demo-shop.yaml

Have a look at shop
Add prometheus-exporter to redis deployment

kubectl -n default apply -f scrape-custom-service/02-redis-prometheus-exporter.yaml

Add ServiceMonitor for redis deployment

kubectl -n default apply -f scrape-custom-service/03-redis-servicemonitor.yaml

See that Prometheus starts scraping Redis
Add Redis Grafana dashboard

kubectl apply -f scrape-custom-service/04-redis-grafana-dashboard.yaml

Add Redis PrometheusRule

kubectl -n default apply -f scrape-custom-service/05-redis-prometheus-rules.yaml

Configure alertmanager
Force alert

 kubectl -n default apply -f scrape-custom-service/05-redis-prometheus-rules-force-alert.yaml```

See that alert fires
Reset

kubectl -n default apply -f scrape-custom-service/05-redis-prometheus-rules.yaml

More and more helm charts already include ServiceMonitors

helm repo add presslabs https://presslabs.github.io/charts
helm upgrade --install mysql-operator presslabs/mysql-operator --namespace mysql-operator --set serviceMonitor.enabled=true --create-namespace

Add db

kubectl apply -f scrape-custom-service/06-mysql-cluster.yaml

Add rules and dashboard

kubectl apply -f scrape-custom-service/07-mysql-rules.yaml
kubectl apply -f scrape-custom-service/08-mysql-grafana-dashboard.yaml

Logging

Install rancher-logging
Install loki

helm upgrade --install loki loki/loki --namespace loki -f logging/loki-values.yaml --create-namespace

Add grafana datasource

kubectl apply -f logging/datasource.yaml
kubectl rollout restart deployment -n cattle-monitoring-system rancher-monitoring-grafana

Add ClusterFlow and Output

kubectl apply -f logging/logging-cluster-flow.yaml

Wait a bit and show logs in Grafana Explorer for {namespace="default"}

HPA

Deploy sample app

kubectl -n default apply -f custom-metrics-hpa/deployment.yaml

Deploy HPA

kubectl -n default apply -f custom-metrics-hpa/hpa.yaml

Create load at https://sample-app.plgrnd.be/

watch kubectl describe hpa -n default
watch kubectl get pods -n default

V1 to V2 migration

Notifiers
Dashboards
Non Prometheus Query Alerts
Prometheus Query Alerts

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: custom-rules
  namespace: default
spec:
  groups:
    - name: custom.rules
      rules:
        - alert: Deployment with unavailable replicas
          expr: kube_deployment_status_replicas_unavailable > 0
          for: 5m
          labels:
            severity: critical
          annotations:
            summary: "Deployment {{ $labels.namespace }}/{{ $labels.deployment }} has {{ $value }} unavailable replicas"

Istio

kubectl label namespace default istio-injection=enabled

Redeploy everything

bogdanro/rancher-2.5-monitoring