grafana/alerting

Review discrepancies between Grafana and Alertmanager contact points

yuri-tceretian opened this issue · 4 comments

This is part of the process of unification of Alertmanagers (Mimir and Grafana). Grafana notifiers are derived from the legacy alerting system and have different formats of configuration, use different ways of communicating with the APIs, etc. Therefore, to unify (or not) notifiers we need to review the discrepancies, document them, and then make decisions in each case.

Alertmanager notifiers:

| Notifier | Mimir | Grafana | Details |
|---|---|---|---|
| Alertmanager | 🔲 | ✅ | |
| DingDing | 🔲 | ✅ | |
| Discord | ✅ | ✅ | |
| Email | ✅ | ✅ | see below |
| Google Chat | 🔲 | ✅ | |
| Kafka | 🔲 | ✅ | |
| LINE | 🔲 | ✅ | |
| OpsGenie | ✅ | ✅ | see below |
| PagerDuty | ✅ | ✅ | see below |
| Pushover | ✅ | ✅ | |
| Sensu Go | 🔲 | ✅ | |
| Slack | ✅ | ✅ | |
| SNS | ✅ | 🔲 | |
| MS Teams | 🔲 | ✅ | |
| Telegram | ✅ | ✅ | see below |
| Threema | 🔲 | ✅ | |
| VictorOps | ✅ | ✅ | |
| Webex | ✅ | ✅ | |
| Webhook | ✅ | ✅ | |
| WeChat | ✅ | 🔲 | |
| WeCom | 🔲 | ✅ | |

Global differences:

  • Alertmanager has a send_resolved flag, whereas Grafana uses disableResolveMessage, which has the opposite meaning (true suppresses resolved notifications).
  • Grafana notifiers support images.
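
Because the two flags are inverses of each other, any conversion between the formats has to negate the value. A minimal sketch of that mapping (the function names are hypothetical and not taken from either code base):

```python
def to_send_resolved(disable_resolve_message: bool) -> bool:
    """Grafana disableResolveMessage -> Alertmanager send_resolved."""
    return not disable_resolve_message


def to_disable_resolve_message(send_resolved: bool) -> bool:
    """Alertmanager send_resolved -> Grafana disableResolveMessage."""
    return not send_resolved
```

Note that this only converts explicit values; the defaults on each side may differ per notifier and need to be handled separately.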

Telegram

Alertmanager:

  • Uses the telebot.v3 package to send notifications
  • Sets "disable web page preview" to true

| Setting | Type | Required | Default |
|---|---|---|---|
| api_url | URL | | Global telegram_api_url, https://api.telegram.org |
| bot_token | Secret | yes | |
| chat_id | int64 | > 0 | |
| message | string | | {{ template "telegram.default.message" . }} |
| disable_notifications | bool | | false |
| parse_mode | string: Markdown, MarkdownV2, HTML | | HTML |

Grafana

  • Supports images
  • Uses the Grafana webhook notifier
  • The API URL is hardcoded to "https://api.telegram.org/bot%s/%s", where the first placeholder is the bot token and the second is the action (sendMessage, sendPhoto)
  • parse_mode is hardcoded to HTML

| Setting | Type | Required | Default |
|---|---|---|---|
| bottoken | encrypted | yes | |
| chat_id | string | yes | |
| message | string | | {{ template "default.message" . }} |
| disable_notifications* | bool | | false |
| parse_mode* | string: Markdown, MarkdownV2, HTML | | HTML |

* - since Grafana 9.4.0
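
To illustrate the hardcoded URL pattern used by Grafana, here is a small sketch that fills in the two placeholders. The helper name and the example token are made up for illustration; only the format string and the two actions come from the notes above.

```python
TELEGRAM_URL_FORMAT = "https://api.telegram.org/bot%s/%s"
ALLOWED_ACTIONS = ("sendMessage", "sendPhoto")


def telegram_url(bot_token: str, action: str) -> str:
    """Build the request URL from the bot token and the action."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError("unsupported action: %s" % action)
    return TELEGRAM_URL_FORMAT % (bot_token, action)
```

For example, `telegram_url("123:abc", "sendPhoto")` yields a URL ending in `/bot123:abc/sendPhoto`.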

PagerDuty

Alertmanager

  • Uses the V1 API if either service_key or service_key_file is specified. Otherwise, uses the V2 API
  • With the V1 API, retries on HTTP status 403
  • With the V2 API, retries on HTTP status 429
  • The incident key of the V1 payload is a hash of the group key
  • routing_key can be a template!

| Setting | Type | Required | Default |
|---|---|---|---|
| service_key | Secret | yes* | |
| service_key_file | string | yes* | |
| routing_key | Secret | yes* | |
| routing_key_file | string | yes* | |
| url | URL | | https://events.pagerduty.com/v2/enqueue |
| client | string | | {{ template "pagerduty.default.client" . }} |
| client_url | string | | {{ template "pagerduty.default.clientURL" . }} |
| description | string | | {{ template "pagerduty.default.description" .}} |
| details | map[string]string | | ** |
| images | []PagerdutyImage | | |
| links | []PagerdutyLink | | |
| source | string | | same as client |
| severity | string | | error |
| class | string | | |
| component | string | | |
| group | string | | |

* One of the fields must be specified.
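
The version selection and retry rules listed above can be sketched as follows. This is an illustration of the behaviour described in this issue, not Alertmanager's actual code: the function names are hypothetical, the config is modelled as a plain dict, and only the two status codes mentioned above are covered.

```python
def pagerduty_api_version(config: dict) -> str:
    """Return "v1" if a service key (or key file) is set, else "v2"."""
    if config.get("service_key") or config.get("service_key_file"):
        return "v1"
    return "v2"


# Status codes that trigger a retry, per API version (from the notes above).
RETRY_STATUS = {"v1": 403, "v2": 429}


def should_retry(config: dict, status: int) -> bool:
    """Check whether the given HTTP status is retried for this config."""
    return status == RETRY_STATUS[pagerduty_api_version(config)]
```

A config with only routing_key set therefore goes through the V2 path and retries on 429 but not on 403.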

** Default details

{
"firing":       "{{ template "pagerduty.default.instances" .Alerts.Firing }}",
"resolved":     "{{ template "pagerduty.default.instances" .Alerts.Resolved }}",
"num_firing":   "{{ .Alerts.Firing | len }}",
"num_resolved": "{{ .Alerts.Resolved | len }}"
}

PagerdutyImage:

| Setting | Type | Required | Default |
|---|---|---|---|
| src | string | | |
| alt | string | | |
| href | string | | |

PagerdutyLink:

| Setting | Type | Required | Default |
|---|---|---|---|
| href | string | | |
| text | string | | |

Grafana

  • Supports only V2 API
  • Does not retry?

| Setting | Type | Required | Default | Equivalent in Alertmanager |
|---|---|---|---|---|
| integrationKey | string | yes | | routing_key |
| severity | string | | "critical" | severity |
| class | string | | "default" | class |
| component | string | | "Grafana" | component |
| group | string | | "default" | group |
| summary | string | | {{ template "default.title" . }} | description |
| source* | string | | hostname; falls back to client if the hostname lookup fails | source |
| client* | string | | "Grafana" | client |
| client_url* | string | | {{ .ExternalURL }} | client_url |
* - added in Grafana 9.4

  • client_details are hardcoded
{
    "firing":       `{{ template "__text_alert_list" .Alerts.Firing }}`,
    "resolved":     `{{ template "__text_alert_list" .Alerts.Resolved }}`,
    "num_firing":   `{{ .Alerts.Firing | len }}`,
    "num_resolved": `{{ .Alerts.Resolved | len }}`,
}
  • links hardcoded to Grafana URL
  • url is hardcoded to https://events.pagerduty.com/v2/enqueue

OpsGenie

Alertmanager

| Setting | Type | Required | Default |
|---|---|---|---|
| api_key | Secret | yes* | |
| api_key_file | string | yes* | |
| api_url | URL | | https://api.opsgenie.com/ |
| message | string | | {{ template "opsgenie.default.message" . }} |
| description | string | | {{ template "opsgenie.default.description" . }} |
| source | string | | {{ template "opsgenie.default.source" . }} |
| details | map[string]string | | |
| entity | string | | |
| responders | []OpsGenieConfigResponder | | |
| actions | string | | |
| tags | string | | |
| note | string | | |
| priority | string | | |
| update_alerts | bool | | |

* - one of these options must be specified

OpsGenieConfigResponder

| Setting | Type | Required | Default |
|---|---|---|---|
| id | string | yes* | |
| name | string | yes* | |
| username | string | yes* | |
| type | string | yes | |

* - one of these options must be specified

  • type must match ^(team|teams|user|escalation|schedule)$ (case insensitive)
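
The two responder rules above (one of id, name or username, plus the type pattern) can be expressed as a small validator. This is a sketch under the assumption that a responder is a plain dict; the function name and error messages are invented for illustration.

```python
import re

# Pattern from the notes above, matched case-insensitively.
RESPONDER_TYPE = re.compile(r"^(team|teams|user|escalation|schedule)$", re.IGNORECASE)


def validate_responder(responder: dict) -> list:
    """Return a list of validation errors (empty if the responder is valid)."""
    errors = []
    if not any(responder.get(key) for key in ("id", "name", "username")):
        errors.append("one of id, name or username must be specified")
    if not RESPONDER_TYPE.match(responder.get("type", "")):
        errors.append("type must be one of team, teams, user, escalation, schedule")
    return errors
```

With this sketch, `{"name": "ops", "type": "Team"}` passes, while `{"type": "group"}` fails both checks.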

Grafana

| Setting | Type | Required | Default | Equivalent in Alertmanager |
|---|---|---|---|---|
| apiKey | string | | | api_key |
| apiUrl | string | | https://api.opsgenie.com/v2/alerts | api_url |
| message | string | | | message |
| description | string | | | description |
| autoClose | bool | | | |
| overridePriority | bool | | | |
| sendTagsAs | string | | | |

  • source is hardcoded to "Grafana"
  • tags are hardcoded to the alert group's CommonLabels and are sent only if sendTagsAs is tags or both. In Alertmanager, tags are configurable
  • label values in details support templates (bug?)
  • autoClose is not supported in Alertmanager

Email

Summary:

Alertmanager

| Setting | Type | Required | Default |
|---|---|---|---|
| to | string | yes | |
| from | string | yes * | |
| hello | string | * | "localhost" |
| smarthost | HostPort | yes * | |
| auth_username | string | * | |
| auth_password | Secret | ** | |
| auth_password_file | string | ** | |
| auth_secret | Secret | * | |
| auth_identity | string | * | |
| headers | map[string]string | | *** |
| html | string | | {{ template "email.default.html" . }} |
| text | string | | |
| require_tls | bool | * | true |
| tls_config | commoncfg.TLSConfig | | |

* If the setting is empty, the global setting is used. The global setting defaults to an empty value unless a default is given in the table above.

** If neither auth_password nor auth_password_file is specified, the global setting is used.

*** Default headers:

  • "Subject": {{ template "email.default.subject" . }}
  • "To" is config.To
  • "From" is config.From
  • "Message-Id" = fmt.Sprintf("<%d.%d@%s>", time.Now().UnixNano(), rand.Uint64(), n.hostname), where n.hostname is os.Hostname with a fallback to localhost.localdomain
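
For readers less familiar with Go, the Message-Id format above translates roughly to the following sketch. It mirrors the described format (nanosecond timestamp, random 64-bit number, hostname) but is not the Alertmanager implementation; the function name and the optional hostname parameter are assumptions added to make the example easy to try.

```python
import random
import socket
import time


def message_id(hostname=None):
    """Build a Message-Id of the form <nanoseconds.random64@hostname>."""
    if hostname is None:
        try:
            hostname = socket.gethostname() or "localhost.localdomain"
        except OSError:
            # Same fallback as described above.
            hostname = "localhost.localdomain"
    return "<%d.%d@%s>" % (time.time_ns(), random.getrandbits(64), hostname)
```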

Notable differences:

  • send_resolved defaults to false
  • to and from must be parseable by ParseAddressList from net/mail, which accepts only comma-separated addresses

Grafana

| Setting | Type | Required | Default | Equivalent in Alertmanager |
|---|---|---|---|---|
| singleEmail | bool | | false | |
| addresses | string | yes | | to * |
| message | string | | | |
| subject | string | | {{ template "default.title" . }} | header "Subject" |

* The accepted format for addresses is more relaxed: a string delimited by any of the following characters: newline (\n), comma (,), or semicolon (;).
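
The relaxed address format can be illustrated with a short sketch that splits on the three delimiters. The function name is hypothetical, and trimming whitespace and dropping empty entries are assumptions made for this example rather than confirmed Grafana behaviour.

```python
import re


def split_addresses(addresses: str) -> list:
    """Split an address string on newline, comma, or semicolon."""
    parts = re.split(r"[\n,;]", addresses)
    # Assumption: surrounding whitespace and empty entries are ignored.
    return [part.strip() for part in parts if part.strip()]
```

A string such as `"a@example.com; b@example.com\nc@example.com"` is accepted here, whereas Alertmanager's comma-only parsing would not accept the semicolon or newline as separators.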

Notable differences:

  • The Grafana Email receiver is very different from Alertmanager's, mostly because it reuses Grafana's infrastructure for sending emails.
  • The email body templating has two stages. In the first stage, the notifier expands the message and subject templates using the template set common to all notifiers. At this stage, the user is not allowed to render HTML code. The second stage embeds the expanded data in the HTML template.
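
The two-stage flow can be sketched as below. This is only a conceptual illustration: it uses Python's string.Template instead of Go templates, the wrapper markup is invented, and HTML escaping is assumed as the mechanism that prevents user-supplied HTML from being rendered.

```python
import html
from string import Template

# Hypothetical stand-in for the HTML email template used in stage two.
HTML_WRAPPER = Template("<html><body><p>$message</p></body></html>")


def render_email(message_template: str, data: dict) -> str:
    # Stage 1: expand the user-provided template as plain text.
    expanded = Template(message_template).safe_substitute(data)
    # Stage 2: embed the expanded text in the HTML template.
    # Escaping (an assumption here) keeps user markup from rendering as HTML.
    return HTML_WRAPPER.substitute(message=html.escape(expanded))
```

Under these assumptions, a message template containing `<b>` tags ends up in the email body as literal text rather than bold markup.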