- OpenShift 4.12+
- kustomize or oc
- ServiceNow instance
This is a simple Flask application that receives alerts from Alertmanager and creates corresponding incidents in ServiceNow.
- Prometheus evaluates alert rules and sends firing alerts to Alertmanager
- Alertmanager routes the alerts to the webhook endpoint exposed by the Flask app
- The Flask app parses the alert data and calls the ServiceNow API to create an incident
- New incidents are created in ServiceNow for each alert
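The flow above can be sketched in a few lines of Python. This is only an illustration of the mapping step, not the application's actual code: the function names, the severity-to-urgency table, and the choice to skip non-firing alerts are assumptions. The input shape follows Alertmanager's webhook payload (an `alerts` list whose items carry `status`, `labels`, and `annotations`), and the request targets the ServiceNow Table API endpoint for incidents. The request is built but not sent.

```python
import base64
import json
import urllib.request

# Hypothetical mapping from the alert's severity label to ServiceNow urgency
# (1 = high, 2 = medium, 3 = low). Adjust to your own conventions.
SEVERITY_TO_URGENCY = {"critical": "1", "warning": "2"}


def alert_to_incident(alert):
    """Translate one Alertmanager alert into ServiceNow incident fields."""
    labels = alert.get("labels", {})
    annotations = alert.get("annotations", {})
    name = labels.get("alertname", "UnknownAlert")
    summary = annotations.get("summary", "")
    return {
        "short_description": f"{name}: {summary}" if summary else name,
        "description": annotations.get("description", ""),
        "urgency": SEVERITY_TO_URGENCY.get(labels.get("severity"), "3"),
    }


def firing_incidents(payload):
    """Return incident fields for every firing alert in a webhook payload."""
    return [
        alert_to_incident(alert)
        for alert in payload.get("alerts", [])
        if alert.get("status") == "firing"  # assumption: resolved alerts are skipped
    ]


def build_request(instance_url, user, password, incident):
    """Build (but do not send) a POST to the ServiceNow Table API."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url=f"{instance_url.rstrip('/')}/api/now/table/incident",
        data=json.dumps(incident).encode(),
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )
```

In a Flask handler, the parsed JSON body of the webhook request would be passed to `firing_incidents`, and each resulting dictionary sent with `urllib.request.urlopen(build_request(...))` or an HTTP client of your choice.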
oc apply -k ./app
Quick deployment
oc apply -k https://github.com/tosin2013/alert-manager-to-service-now/app --dry-run=client -o yaml
The --dry-run=client -o yaml flags only print the rendered manifests; remove them to apply the resources to the cluster.
Example Custom Alert
apiVersion: v1
data:
  custom_rules.yaml: |
    groups:
      - name: kube-node-health
        rules:
        - alert: NodeNotReady
          annotations:
            summary: Notify when any node on a cluster is in NotReady state
            description: "One of the nodes in the cluster is down: Cluster {{ $labels.cluster }} {{ $labels.clusterID }}."
          expr: kube_node_status_condition{condition="Ready",job="kube-state-metrics",status="true"} != 1
          for: 5s
          labels:
            instance: "{{ $labels.instance }}"
            cluster: "{{ $labels.cluster }}"
            clusterID: "{{ $labels.clusterID }}"
            tag: kubenode
            severity: critical
kind: ConfigMap
metadata:
  name: thanos-ruler-custom-rules
  namespace: open-cluster-management-observability
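The `{{ $labels.<name> }}` placeholders in the rule above are filled in from the labels of the series that triggered the alert. The sketch below imitates that substitution in Python so you can preview the resulting text. It is a simplified stand-in for the real Go template engine, handling only plain `$labels` references; the `render` helper and the label values are hypothetical.

```python
import re

# Matches references such as {{ $labels.cluster }} and captures the label name.
LABEL_REF = re.compile(r"\{\{\s*\$labels\.(\w+)\s*\}\}")


def render(template, labels):
    """Replace each {{ $labels.name }} with its value.

    Assumption: a label that is not present renders as an empty string.
    """
    return LABEL_REF.sub(lambda m: labels.get(m.group(1), ""), template)


# Hypothetical label values for illustration only.
labels = {"cluster": "local-cluster", "clusterID": "1234"}
text = render("Cluster {{ $labels.cluster }} {{ $labels.clusterID }}.", labels)
print(text)  # Cluster local-cluster 1234.
```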
Example service-now webhook
global:
  resolve_timeout: "5m"
receivers:
  - name: "null"
  - name: "service-now"
    webhook_configs:
      - url: "https://alert-manager-to-service-now-alert-manager-to-service-now.apps.ocp4.example.com/alerts"
route:
  group_by:
    - "namespace"
  group_interval: "5m"
  group_wait: "30s"
  receiver: "null"
  repeat_interval: "12h"
  routes:
    - match:
        alertname: "Watchdog"
      receiver: "null"
    - match:
      receiver: "service-now"
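To see which receiver an alert ends up with under the routing tree above, the following sketch models the selection in Python. It is a deliberately simplified model, not Alertmanager's implementation: it only supports exact `match` labels, ignores `match_re`, `matchers`, and `continue`, and assumes the second route's empty `match` acts as a catch-all. The `pick_receiver` function and the `ROUTE` dictionary are illustrative names.

```python
def pick_receiver(labels, route):
    """Return the receiver name for an alert with the given labels.

    Child routes are tried in order and the first one whose `match` labels
    all equal the alert's labels wins. If no child matches, the alert stays
    with the parent route's receiver.
    """
    for child in route.get("routes", []):
        match = child.get("match") or {}  # an empty match accepts every alert
        if all(labels.get(key) == value for key, value in match.items()):
            # A child without its own receiver inherits the parent's.
            inherited = {"receiver": route["receiver"], **child}
            return pick_receiver(labels, inherited)
    return route["receiver"]


# The `route` section of the configuration above, as a Python dictionary.
ROUTE = {
    "receiver": "null",
    "routes": [
        {"match": {"alertname": "Watchdog"}, "receiver": "null"},
        {"match": None, "receiver": "service-now"},
    ],
}

print(pick_receiver({"alertname": "Watchdog"}, ROUTE))      # null
print(pick_receiver({"alertname": "NodeNotReady"}, ROUTE))  # service-now
```

Under this model the always-firing Watchdog alert is discarded by the "null" receiver, while every other alert reaches the webhook and therefore the Flask app.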
See the documentation in the repository for more details on the YAML configuration and deployment instructions.