Slack/Teams/PagerDuty messages not working
Closed this issue · 7 comments
Somehow, none of the providers below receives any messages from Flux. The Alerts were created similar to the docs:
```yaml
apiVersion: notification.toolkit.fluxcd.io/v1beta3
kind: Alert
metadata:
  name: test
  namespace: flux-system
spec:
  summary: Testing notification
  providerRef:
    name: provider
  eventSources:
    - kind: GitRepository
      name: '*'
    - kind: Kustomization
      name: '*'
```
Providers too:
```yaml
---
apiVersion: notification.toolkit.fluxcd.io/v1beta3
kind: Provider
metadata:
  name: slack-bot
  namespace: flux-system
spec:
  type: slack
  channel: alerts
  address: https://slack.com/api/chat.postMessage
  secretRef:
    name: slack-bot-token
---
apiVersion: notification.toolkit.fluxcd.io/v1beta3
kind: Provider
metadata:
  name: slack-webhook
  namespace: flux-system
spec:
  type: slack
  channel: alerts
  secretRef:
    name: slack-webhook
---
apiVersion: notification.toolkit.fluxcd.io/v1beta3
kind: Provider
metadata:
  name: pagerduty
  namespace: flux-system
spec:
  type: pagerduty
  channel: R...
  address: https://events.pagerduty.com
---
apiVersion: notification.toolkit.fluxcd.io/v1beta3
kind: Provider
metadata:
  name: teams
  namespace: flux-system
spec:
  type: msteams
  secretRef:
    name: teams-webhook
---
apiVersion: notification.toolkit.fluxcd.io/v1beta3
kind: Provider
metadata:
  name: generic
  namespace: flux-system
spec:
  type: generic
  address: https://...ngrok-free.app
```
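For reference, the `slack-bot` provider's `secretRef` is expected to point at a Secret holding the bot token under a `token` key; a sketch with a placeholder value (the Secret name matches the `secretRef` above):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: slack-bot-token
  namespace: flux-system
stringData:
  token: xoxb-placeholder   # Slack bot token (placeholder value)
```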
I removed the severity filter so that any message could come through, but it hasn't worked. The only provider that produced a message was the generic one (I ran a local Python HTTP server exposed through ngrok). I even tried pointing the Slack provider at that ngrok URL as its address, but nothing reached the service.
I tested with `wget` from within the container and all endpoints were reachable, so it's neither a NetworkPolicy nor an EC2 Security Group blocking anything (and that would have affected ngrok too, if it were the case).
Any ideas on what is going on?
Tested in two different EKS clusters, versions 1.2.2 and 1.3.0, same behaviour happened in both.
I have a similar issue. I'm trying to use the Slack provider with the incoming webhook configuration. It looks like events are getting dispatched correctly but I don't see anything inside Slack. For example, from the notification controller logs:
```json
{"level":"info","ts":"2024-07-12T15:40:49.493Z","logger":"event-server","msg":"dispatching event","eventInvolvedObject":{"kind":"Kustomization","namespace":"default","name":"nfs-provisioner","uid":"d8281d48-e3ee-4654-8238-50adb30ce3ec","apiVersion":"kustomize.toolkit.fluxcd.io/v1","resourceVersion":"18642991"},"message":"Reconciliation finished in 194.613846ms, next run in 1h0m0s"}
```
@comminutus that event is not meant to reach Slack; you only get notifications when something changes in the cluster.
@stefanprodan that's strange, because my Alert looks like this:
```yaml
apiVersion: notification.toolkit.fluxcd.io/v1beta3
kind: Alert
metadata:
  name: all-alerts
  namespace: default
spec:
  providerRef:
    name: slack
  eventSources:
    - kind: HelmRelease
      name: '*'
    - kind: Kustomization
      name: '*'
```
If I don't create this Alert in the cluster, then the message I get from the notification controller is `discarding event, no alerts found for the involved object`. When I add the Alert, I get `dispatching event`.
If the event doesn't go to slack, then what does "dispatching event" mean?
> If the event doesn't go to slack, then what does "dispatching event" mean?
That particular event is for Git commit status updates; it doesn't get routed to Slack, as that would create a massive amount of spam. Change something in your manifests in Git that will trigger a change in the cluster, e.g. bump the replicas of some deployment, and the notification should show up in Slack.
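For example, a replica bump in a Git-tracked manifest is enough to produce an apply event once the Kustomization reconciles (the deployment and namespace names here are placeholders):

```yaml
# deployment.yaml in the Git repository watched by Flux (placeholder names)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: podinfo
  namespace: default
spec:
  replicas: 3   # bumped from 2; the resulting change event is routed to the Alert's provider
```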
@stefanprodan , ok I deleted a kustomization, and reconciled the parent kustomization. This appeared in the notification controller log:
```json
{"level":"info","ts":"2024-07-13T11:57:56.183Z","logger":"event-server","msg":"dispatching event","eventInvolvedObject":{"kind":"Kustomization","namespace":"default","name":"fresh-rss","uid":"2c897664-8ead-49d3-b8d1-e77188e4e863","apiVersion":"kustomize.toolkit.fluxcd.io/v1","resourceVersion":"19518738"},"message":"Secret/default/fresh-rss created\nService/default/fresh-rss created\nDeployment/default/fresh-rss created\nPersistentVolumeClaim/default/fresh-rss created\nDatabase/default/fresh-rss created\nGrant/default/fresh-rss created\nUser/default/fresh-rss created\nIngress/default/fresh-rss created"}
```
I still don't get any message in Slack.
Also, it looks like the only place in the code where "dispatching event" exists is here, which looks like it calls the `dispatchNotification` function on the `EventServer`. I don't see where it would be doing any filtering.

@comminutus I have just made a test here, and I have a Slack Incoming Webhook provider working in production right now with the latest version of Flux. The configuration looks like this:
```yaml
apiVersion: notification.toolkit.fluxcd.io/v1beta3
kind: Alert
metadata:
  name: alert
  namespace: flux-system
spec:
  eventSeverity: error
  eventSources:
    - kind: GitRepository
      name: '*'
    - kind: Kustomization
      name: '*'
  providerRef:
    name: slack
---
apiVersion: notification.toolkit.fluxcd.io/v1beta3
kind: Provider
metadata:
  name: slack
  namespace: flux-system
spec:
  channel: flux-releases
  secretRef:
    name: slack-url
  type: slack
---
apiVersion: v1
kind: Secret
metadata:
  name: slack-url
  namespace: flux-system
type: Opaque
stringData:
  address: https://hooks.slack.com/services/xxxxxxx/xxxxxxx/xxxxxx
```
Do you have a URL that looks like the one above? If you send a correct payload to that URL does it work?
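One way to check the webhook directly, outside Flux, is to POST a minimal payload to it with curl (the URL below is a placeholder; a working Slack Incoming Webhook responds with HTTP 200 and the body `ok`):

```shell
# Placeholder webhook URL - substitute the value stored in the slack-url Secret
SLACK_WEBHOOK='https://hooks.slack.com/services/xxxxxxx/xxxxxxx/xxxxxx'

# Minimal Slack Incoming Webhook payload
PAYLOAD='{"text": "Manual test: Flux notification debugging"}'

curl -sS -X POST \
  -H 'Content-Type: application/json' \
  -d "$PAYLOAD" \
  "$SLACK_WEBHOOK"
```

If this fails at the DNS or TLS stage, the problem is in the network path rather than in the Flux configuration.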
@matheuscscp Thanks, I hadn't thought of just testing the webhook URL. When I tried it with `curl`, it couldn't resolve hooks.slack.com. Same with `getent hosts ...`. Then I realized NextDNS was blocking it 😬 👎. Sorry for the false alarm!