ansible/event-driven-ansible

AlertManager Event Source Plugin fails with `405: Method Not Allowed`

Daniel-Vaz opened this issue · 2 comments

Issue

When using the AlertManager Event Source Plugin, rulebook actions are never executed. Alertmanager logs the following error:

ts=2023-08-22T11:11:34.420Z caller=dispatch.go:352 level=error component=dispatcher msg="Notify for alerts failed" num_alerts=1 err="ansible-awx/testing-ansible-eda/edaAlertmanager/webhook[0]: notify retry canceled due to unrecoverable error after 1 attempts: unexpected status code 405: http://activation-job-33-8000.ansible-awx:8000/: 405: Method Not Allowed"

Details

  • We are using the latest version of the eda-server-operator, together with the latest eda-ui and eda-server images.
  • K8s Cluster on v1.25.9.
  • Decision Environment using the image quay.io/ansible/ansible-rulebook:latest
  • The cluster runs the Prometheus Operator, so we set the EDA alert receiver with the following AlertmanagerConfig (Alertmanager version 0.25.0):
apiVersion: monitoring.coreos.com/v1alpha1
kind: AlertmanagerConfig
metadata:
  name: testing-ansible-eda
  namespace: ansible-awx
spec:
  route:
    groupBy: ['job']
    groupInterval: 1m
    repeatInterval: 2m
    receiver: 'ParentRoute'
    routes:
    - receiver: 'edaAlertmanager'
      continue: true
      matchers:
      - name: edaAlertmanager
        matchType: "=" 
        value: "true"
  receivers:
  - name: ParentRoute
  - name: edaAlertmanager
    webhookConfigs:
    - url: "http://activation-job-33-8000.ansible-awx:8000/"
      sendResolved: true
  • As shown above, the edaAlertmanager route and receiver send alerts to the activation job's Service, using the name, namespace, and port generated by the operator.
  • The Activation Pod for this job (the target of the Service referenced above) reports the following logs:
2023-08-22 10:34:42,409 - ansible_rulebook.app - INFO - Starting worker mode
2023-08-22 10:34:42,410 - ansible_rulebook.websocket - INFO - websocket ws://eda-daphne:8001/api/eda/ws/ansible-rulebook connecting
2023-08-22 10:34:42,427 - ansible_rulebook.websocket - INFO - websocket ws://eda-daphne:8001/api/eda/ws/ansible-rulebook connected
2023-08-22 10:34:42,501 - ansible_rulebook.job_template_runner - INFO - Attempting to connect to Controller https://ansible-awx.platform-lab.internal.epo.org
2023-08-22 10:34:42,721 - ansible_rulebook.app - INFO - AAP Version 22.7.0
2023-08-22 10:34:42,722 - ansible_rulebook.app - INFO - Starting sources
2023-08-22 10:34:42,722 - ansible_rulebook.app - INFO - Starting rules
2023-08-22 10:34:42,722 - ansible_rulebook.engine - INFO - run_ruleset
2023-08-22 10:34:42,722 - drools.ruleset - INFO - Using jar: /opt/app-root/lib/python3.9/site-packages/drools/jars/drools-ansible-rulebook-integration-runtime-1.0.3-SNAPSHOT.jar
2023-08-22 10:34:43 224 [main] INFO org.drools.ansible.rulebook.integration.api.rulesengine.AbstractRulesEvaluator - Start automatic pseudo clock with a tick every 100 milliseconds
2023-08-22 10:34:43,227 - ansible_rulebook.engine - INFO - ruleset define: {"name": "Example AlertManager Alerts RuleBook", "hosts": ["all"], "sources": [{"EventSource": {"name": "listen for alerts", "source_name": "ansible.eda.alertmanager", "source_args": {"host": "0.0.0.0", "port": 8000}, "source_filters": []}}], "rules": [{"Rule": {"name": "Example AlertManager Alerts", "condition": {"AllCondition": [{"IsDefinedExpression": {"Event": "meta"}}]}, "actions": [{"Action": {"action": "run_job_template", "action_args": {"name": "Sleep", "organization": "org-EPO-operations"}}}], "enabled": true}}]}
2023-08-22 10:34:43,235 - ansible_rulebook.engine - INFO - load source
2023-08-22 10:34:43,622 - ansible_rulebook.engine - INFO - load source filters
2023-08-22 10:34:43,623 - ansible_rulebook.engine - INFO - loading eda.builtin.insert_meta_info
2023-08-22 10:34:43,994 - ansible_rulebook.engine - INFO - Calling main in ansible.eda.alertmanager
2023-08-22 10:34:43,995 - ansible_rulebook.websocket - INFO - feedback websocket ws://eda-daphne:8001/api/eda/ws/ansible-rulebook connecting
2023-08-22 10:34:43,997 - ansible_rulebook.engine - INFO - Waiting for all ruleset tasks to end
2023-08-22 10:34:43 997 [drools-async-evaluator-thread] INFO org.drools.ansible.rulebook.integration.api.io.RuleExecutorChannel - Async channel connected
2023-08-22 10:34:44,013 - ansible_rulebook.rule_set_runner - INFO - Waiting for actions on events from Example AlertManager Alerts RuleBook
2023-08-22 10:34:44,013 - ansible_rulebook.rule_set_runner - INFO - Waiting for events, ruleset: Example AlertManager Alerts RuleBook
2023-08-22 10:34:44,015 - ansible_rulebook.websocket - INFO - feedback websocket ws://eda-daphne:8001/api/eda/ws/ansible-rulebook connected
2023-08-22 10:35:07,925 - aiohttp.access - INFO - 172.17.133.131 [22/Aug/2023:10:35:07 +0000] "POST / HTTP/1.1" 405 207 "-" "Alertmanager/0.25.0"
2023-08-22 10:36:01,237 - aiohttp.access - INFO - 172.17.133.131 [22/Aug/2023:10:36:01 +0000] "POST / HTTP/1.1" 405 207 "-" "Alertmanager/0.25.0"
2023-08-22 11:07:34,421 - aiohttp.access - INFO - 172.17.188.46 [22/Aug/2023:11:07:34 +0000] "POST / HTTP/1.1" 405 207 "-" "Alertmanager/0.25.0"
2023-08-22 11:08:34,418 - aiohttp.access - INFO - 172.17.188.46 [22/Aug/2023:11:08:34 +0000] "POST / HTTP/1.1" 405 207 "-" "Alertmanager/0.25.0"
2023-08-22 11:09:34,419 - aiohttp.access - INFO - 172.17.188.46 [22/Aug/2023:11:09:34 +0000] "POST / HTTP/1.1" 405 207 "-" "Alertmanager/0.25.0"
2023-08-22 11:10:34,420 - aiohttp.access - INFO - 172.17.188.46 [22/Aug/2023:11:10:34 +0000] "POST / HTTP/1.1" 405 207 "-" "Alertmanager/0.25.0"
2023-08-22 11:11:34,420 - aiohttp.access - INFO - 172.17.188.46 [22/Aug/2023:11:11:34 +0000] "POST / HTTP/1.1" 405 207 "-" "Alertmanager/0.25.0"
  • As the logs show, the Alertmanager alert reaches the EDA Activation Pod, but the pod rejects the POST request with a 405: Method Not Allowed response.

Questions

  • Could this be related to the way Alertmanager is configured and sends alerts?
  • Or does the AlertManager Event Source Plugin need to be updated to handle these alerts?

Any help would be greatly appreciated. Thank you in advance.

If I use the webhook Event Source Plugin instead, the logs show only a 200 response and nothing else happens:

2023-08-22 09:54:15,275 - ansible_rulebook.app - INFO - Starting worker mode
2023-08-22 09:54:15,275 - ansible_rulebook.websocket - INFO - websocket ws://eda-daphne:8001/api/eda/ws/ansible-rulebook connecting
2023-08-22 09:54:15,291 - ansible_rulebook.websocket - INFO - websocket ws://eda-daphne:8001/api/eda/ws/ansible-rulebook connected
2023-08-22 09:54:15,361 - ansible_rulebook.app - INFO - Starting sources
2023-08-22 09:54:15,361 - ansible_rulebook.app - INFO - Starting rules
2023-08-22 09:54:15,361 - ansible_rulebook.engine - INFO - run_ruleset
2023-08-22 09:54:15,361 - drools.ruleset - INFO - Using jar: /opt/app-root/lib/python3.9/site-packages/drools/jars/drools-ansible-rulebook-integration-runtime-1.0.3-SNAPSHOT.jar
2023-08-22 09:54:15 863 [main] INFO org.drools.ansible.rulebook.integration.api.rulesengine.AbstractRulesEvaluator - Start automatic pseudo clock with a tick every 100 milliseconds
2023-08-22 09:54:15,867 - ansible_rulebook.engine - INFO - ruleset define: {"name": "Listen for events on a webhook", "hosts": ["all"], "sources": [{"EventSource": {"name": "ansible.eda.webhook", "source_name": "ansible.eda.webhook", "source_args": {"host": "0.0.0.0", "port": 5000}, "source_filters": []}}], "rules": [{"Rule": {"name": "Say Hello", "condition": {"AllCondition": [{"IsDefinedExpression": {"Event": "msg"}}]}, "actions": [{"Action": {"action": "print_event", "action_args": {"pretty": true}}}], "enabled": true}}]}
2023-08-22 09:54:15,874 - ansible_rulebook.engine - INFO - load source
2023-08-22 09:54:16,292 - ansible_rulebook.engine - INFO - load source filters
2023-08-22 09:54:16,292 - ansible_rulebook.engine - INFO - loading eda.builtin.insert_meta_info
2023-08-22 09:54:16,670 - ansible_rulebook.engine - INFO - Calling main in ansible.eda.webhook
2023-08-22 09:54:16,672 - ansible_rulebook.websocket - INFO - feedback websocket ws://eda-daphne:8001/api/eda/ws/ansible-rulebook connecting
2023-08-22 09:54:16,673 - ansible_rulebook.engine - INFO - Waiting for all ruleset tasks to end
2023-08-22 09:54:16 673 [drools-async-evaluator-thread] INFO org.drools.ansible.rulebook.integration.api.io.RuleExecutorChannel - Async channel connected
2023-08-22 09:54:16,689 - ansible_rulebook.rule_set_runner - INFO - Waiting for actions on events from Listen for events on a webhook
2023-08-22 09:54:16,689 - ansible_rulebook.rule_set_runner - INFO - Waiting for events, ruleset: Listen for events on a webhook
2023-08-22 09:54:16,692 - ansible_rulebook.websocket - INFO - feedback websocket ws://eda-daphne:8001/api/eda/ws/ansible-rulebook connected
2023-08-22 09:55:02,236 - aiohttp.access - INFO - 172.17.133.131 [22/Aug/2023:09:55:02 +0000] "POST / HTTP/1.1" 200 150 "-" "Alertmanager/0.25.0"
2023-08-22 09:55:13,993 - aiohttp.access - INFO - 172.17.163.17 [22/Aug/2023:09:55:13 +0000] "POST / HTTP/1.1" 200 150 "-" "Alertmanager/0.25.0"
2023-08-22 09:56:02,234 - aiohttp.access - INFO - 172.17.133.131 [22/Aug/2023:09:56:02 +0000] "POST / HTTP/1.1" 200 150 "-" "Alertmanager/0.25.0"
2023-08-22 09:56:13,991 - aiohttp.access - INFO - 172.17.163.17 [22/Aug/2023:09:56:13 +0000] "POST / HTTP/1.1" 200 150 "-" "Alertmanager/0.25.0"
2023-08-22 09:59:13,994 - aiohttp.access - INFO - 172.17.163.17 [22/Aug/2023:09:59:13 +0000] "POST / HTTP/1.1" 200 150 "-" "Alertmanager/0.25.0"
2023-08-22 10:00:13,994 - aiohttp.access - INFO - 172.17.163.17 [22/Aug/2023:10:00:13 +0000] "POST / HTTP/1.1" 200 150 "-" "Alertmanager/0.25.0"

Closing issue.
The tests I performed were wrong.

I was sending Alertmanager alerts to the "/" path of the EDA endpoint instead of "/endpoint".
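
For anyone hitting the same 405, the receiver would then look roughly like the fragment below. This is only a sketch based on the configuration posted in the issue: the Service name, namespace, and port are the ones from the original report, and the only change is the "/endpoint" path appended to the URL. The rest of the AlertmanagerConfig (metadata and route) is assumed to stay as it was.

```yaml
# Fragment of the AlertmanagerConfig from the issue; only the webhook URL changes.
spec:
  receivers:
  - name: ParentRoute
  - name: edaAlertmanager
    webhookConfigs:
    # POST to "/endpoint" instead of "/", which the activation pod answered with 405.
    - url: "http://activation-job-33-8000.ansible-awx:8000/endpoint"
      sendResolved: true
```

After applying the change, the aiohttp.access lines in the Activation Pod log should show the path "/endpoint" in the POST request rather than "/".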