Fill out terraform-setup/terraform.tfvars with AWS and DigitalOcean credentials.
Create infrastructure and install Rancher
make install
Go to https://rancher-demo.plgrnd.be/login and set the admin password and server URL.
- In the local cluster, go to Cluster Explorer -> Apps & Marketplace.
- Install the rancher-monitoring app with the default settings
NOTE: On Rancher < 2.5.2, install from the dev-v2.5 branch, enable k3s in the YAML, and increase the memory limit to 3500Mi.
- See the helm install output
- Have a look at the installed workloads in cattle-monitoring-system.
- Have a look at Prometheus
- Scraping targets
- Built-in alerting rules
- Have a look at Grafana
- Built-in dashboards
- Logging into Grafana as admin
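Before opening Prometheus and Grafana, it can help to confirm that the monitoring workloads have finished rolling out. The following is a minimal sketch, not part of the demo repository: the helper name `wait_for_monitoring` and the default timeout are assumptions, and the default deployment name `rancher-monitoring-grafana` is taken from the rollout restart command used in the logging steps of this walkthrough.

```shell
# Hypothetical helper: block until a deployment in cattle-monitoring-system
# has finished rolling out (or the timeout expires).
#   $1 = deployment name (default: rancher-monitoring-grafana)
#   $2 = timeout         (default: 180s, an arbitrary choice)
wait_for_monitoring() {
  deployment="${1:-rancher-monitoring-grafana}"
  timeout="${2:-180s}"
  kubectl -n cattle-monitoring-system rollout status \
    "deployment/${deployment}" --timeout="${timeout}"
}

# Usage (needs access to the cluster):
#   wait_for_monitoring
#   kubectl -n cattle-monitoring-system get pods
```

The function only wraps `kubectl rollout status`, so it returns non-zero if the rollout does not complete within the timeout.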
- Deploy shop
kubectl -n default apply -f scrape-custom-service/01-demo-shop.yaml
- Have a look at shop
- Add prometheus-exporter to redis deployment
kubectl -n default apply -f scrape-custom-service/02-redis-prometheus-exporter.yaml
- Add ServiceMonitor for redis deployment
kubectl -n default apply -f scrape-custom-service/03-redis-servicemonitor.yaml
- See that Prometheus starts scraping Redis
- Add Redis Grafana dashboard
kubectl apply -f scrape-custom-service/04-redis-grafana-dashboard.yaml
- Add Redis PrometheusRule
kubectl -n default apply -f scrape-custom-service/05-redis-prometheus-rules.yaml
- Configure alertmanager
- Force alert
kubectl -n default apply -f scrape-custom-service/05-redis-prometheus-rules-force-alert.yaml
- See that alert fires
- Reset
kubectl -n default apply -f scrape-custom-service/05-redis-prometheus-rules.yaml
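The contents of scrape-custom-service/03-redis-servicemonitor.yaml are not shown in this walkthrough. As a rough illustration of what the ServiceMonitor step above does, the sketch below prints a generic ServiceMonitor manifest. Everything specific in it is an assumption rather than the demo's actual file: the helper name `render_servicemonitor`, the `app` label used in the selector, the named port `metrics`, and the 30s scrape interval.

```shell
# Hypothetical helper: print a generic ServiceMonitor manifest to stdout.
#   $1 = ServiceMonitor name
#   $2 = value of the "app" label on the Service to scrape (assumed label key)
#   $3 = name of the Service port that exposes metrics (assumed port name)
render_servicemonitor() {
  cat <<EOF
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: $1
  namespace: default
spec:
  selector:
    matchLabels:
      app: $2
  endpoints:
  - port: $3
    interval: 30s
EOF
}

# Usage (needs access to the cluster):
#   render_servicemonitor redis redis metrics | kubectl apply -f -
```

The selector matches Services, not pods, so the exporter port has to be exposed on a Service with a matching label for Prometheus to pick it up.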
helm repo add presslabs https://presslabs.github.io/charts
helm upgrade --install mysql-operator presslabs/mysql-operator --namespace mysql-operator --set serviceMonitor.enabled=true --create-namespace
Add db
kubectl apply -f scrape-custom-service/06-mysql-cluster.yaml
Add rules and dashboard
kubectl apply -f scrape-custom-service/07-mysql-rules.yaml
kubectl apply -f scrape-custom-service/08-mysql-grafana-dashboard.yaml
- Install rancher-logging
- Install loki (the loki Helm repository must already be added)
helm upgrade --install loki loki/loki --namespace loki -f logging/loki-values.yaml --create-namespace
- Add grafana datasource
kubectl apply -f logging/datasource.yaml
kubectl rollout restart deployment -n cattle-monitoring-system rancher-monitoring-grafana
- Add ClusterFlow and Output
kubectl apply -f logging/logging-cluster-flow.yaml
Wait a bit and show logs in Grafana Explore for {namespace="default"}
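The same LogQL selector can also be checked from the command line against Loki's HTTP API, which is useful if nothing shows up in Grafana. This is a minimal sketch under stated assumptions: the helper name `query_loki` is hypothetical, and the Service name `loki` and port 3100 in the usage comment are guesses based on the Helm release name and Loki's usual default, not values confirmed by this repository.

```shell
# Hypothetical helper: run a LogQL query against Loki's query_range endpoint.
#   $1 = Loki base URL
#   $2 = LogQL query
#   $3 = maximum number of log lines to return (default: 20)
query_loki() {
  curl -s -G "$1/loki/api/v1/query_range" \
    --data-urlencode "query=$2" \
    --data-urlencode "limit=${3:-20}"
}

# Usage, assuming Loki has been made reachable locally, e.g. with
#   kubectl -n loki port-forward svc/loki 3100:3100
# (Service name and port are assumptions):
#   query_loki http://localhost:3100 '{namespace="default"}'
```

An empty result set most likely means the ClusterFlow and Output have not shipped any logs yet.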
- Deploy sample app
kubectl -n default apply -f custom-metrics-hpa/deployment.yaml
Deploy HPA
kubectl -n default apply -f custom-metrics-hpa/hpa.yaml
Create load at https://sample-app.plgrnd.be/
watch kubectl describe hpa -n default
watch kubectl get pods -n default
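Load can be created from a browser, but a small loop is easier to repeat. The sketch below is one possible way to do it; the helper name `generate_load` and the default request count are assumptions, and whether this is enough traffic to trigger scaling depends on the thresholds in custom-metrics-hpa/hpa.yaml, which are not shown here.

```shell
# Hypothetical helper: send N sequential requests to a URL and print
# one HTTP status code per line.
#   $1 = target URL
#   $2 = number of requests (default: 100, an arbitrary choice)
generate_load() {
  url="$1"
  count="${2:-100}"
  i=0
  while [ "$i" -lt "$count" ]; do
    curl -s -o /dev/null -w '%{http_code}\n' "$url"
    i=$((i + 1))
  done
}

# Usage:
#   generate_load https://sample-app.plgrnd.be/ 500
```

Requests are sent one after another, so running the function in several terminals at once is a simple way to raise the request rate while the two watch commands above are open.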
- Notifiers
- Dashboards
- Non Prometheus Query Alerts
- Prometheus Query Alerts
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: custom-rules
  namespace: default
spec:
  groups:
  - name: custom.rules
    rules:
    - alert: Deployment with unavailable replicas
      expr: kube_deployment_status_replicas_unavailable > 0
      for: 5m
      labels:
        severity: critical
      annotations:
        summary: "Deployment {{ $labels.namespace }}/{{ $labels.deployment }} has {{ $value }} unavailable replicas"
kubectl label namespace default istio-injection=enabled
Redeploy everything so that the pods are recreated with the Istio sidecar injected.
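Sidecar injection only happens when a pod is created, so workloads that were running before the namespace was labeled need to be restarted. One way to do that for Deployments is sketched below; the helper name `redeploy_namespace` is hypothetical, and other workload types (for example StatefulSets) are not covered and would need the equivalent command.

```shell
# Hypothetical helper: restart every Deployment in a namespace so that new
# pods are created and can be picked up by Istio's sidecar injector.
#   $1 = namespace (default: default)
redeploy_namespace() {
  ns="${1:-default}"
  kubectl -n "$ns" rollout restart deployment
}

# Usage (needs access to the cluster):
#   redeploy_namespace default
#   kubectl -n default get pods
```

With injection working, each restarted pod should list an additional istio-proxy container.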