StackStorm/st2contrib

Nagios handler posts to st2 but fails

vishnu81 opened this issue · 14 comments

I ran python st2service_handler.py st2service_handler.yaml 123456 "Disk /var/log" WARNING 1 HARD 3 vm-vidr-004 --verbose from the Nagios host.
st2 receives the event but fails to run the action.
Here's the st2rulesengine.log snippet:


2016-08-11 14:32:52,989 140359568239344 ERROR filter [-] There might be a problem with critera in rule RuleDB(action=ActionExecutionSpecDB@140359567551696(ref="check_pgrep", parameters="{u'cmd': u'{{trigger.service}} 1 10', u'hosts': u'{{trigger.host}}'}"), criteria={u'state_type': {u'pattern': u'HARD', u'type': u'matchregex'}, u'service': {u'pattern': u'(.*)_check_(.*)_process', u'type': u'matchregex'}}, description="Check process state on host", enabled=True, id=57ac383c22399c60654dcb6e, name="check_proc", pack="nagios", ref="nagios.check_proc", tags=[], trigger="nagios.service_state_change", type=RuleTypeSpecDB@140359567550544(ref="standard", parameters="{}"), uid="rule:nagios:check_proc").
Traceback (most recent call last):
  File "/opt/stackstorm/st2/lib/python2.7/site-packages/st2reactor/rules/filter.py", line 133, in _check_criterion
    result = op_func(value=payload_value, criteria_pattern=criteria_pattern)
  File "/opt/stackstorm/st2/lib/python2.7/site-packages/st2common/operators.py", line 130, in match_regex
    return regex.match(value) is not None
TypeError: expected string or buffer (_trigger_instance={'status': 'processing', 'occurrence_time': '2016-08-11 18:32:52.887270+00:00', 'trigger': u'nagios.service_state_change', 'id': '57acc4d422399c7c8ab95bcf', 'payload': {'attempt': '3', 'service': 'Disk /var/log', 'event_id': '123456', 'host': 'vm-vidr-004', 'state_type': 'HARD', 'state': 'WARNING', 'msg': 'We gots a warning yo!', 'state_id': '1'}},_trigger={'uid': u'trigger:nagios:service_state_change:5f02f0889301fd7be1ac972c11bf3e7d', 'parameters': {}, 'ref_count': 0, 'name': u'service_state_change', 'pack': u'nagios', 'type': u'nagios.service_state_change', 'id': '57ac36b922399c5f741e95ed', 'description': None},_rule={'description': u'Check process state on host', 'tags': [], 'ref': u'nagios.check_proc', 'enabled': True, 'name': u'check_proc', 'trigger': u'nagios.service_state_change', 'criteria': {u'state_type': {u'pattern': u'HARD', u'type': u'matchregex'}, u'service': {u'pattern': u'(.*)_check_(.*)_process', u'type': u'matchregex'}}, 'action': 'ActionExecutionSpecDB@140359567551696(ref="check_pgrep", parameters="{u\'cmd\': u\'{{trigger.service}} 1 10\', u\'hosts\': u\'{{trigger.host}}\'}")', 'pack': u'nagios', 'type': 'RuleTypeSpecDB@140359567550544(ref="standard", parameters="{}")', 'id': '57ac383c22399c60654dcb6e', 'uid': u'rule:nagios:check_proc'})
2016-08-11 14:32:52,996 140359568239344 ERROR worker [-] Failed to handle trigger_instance TriggerInstanceDB(id=57acc4d422399c7c8ab95bcf, occurrence_time="2016-08-11 18:32:52.887270+00:00", payload={'attempt': '3', 'service': 'Disk /var/log', 'event_id': '123456', 'state': 'WARNING', 'state_type': 'HARD', 'host': 'vm-vidr-004', 'msg': 'We gots a warning yo!', 'state_id': '1'}, status="processing_failed", trigger="nagios.service_state_change").
Traceback (most recent call last):
  File "/opt/stackstorm/st2/lib/python2.7/site-packages/st2reactor/rules/worker.py", line 84, in process
    self.rules_engine.handle_trigger_instance(trigger_instance)
  File "/opt/stackstorm/st2/lib/python2.7/site-packages/st2reactor/rules/engine.py", line 28, in handle_trigger_instance
    matching_rules = self.get_matching_rules_for_trigger(trigger_instance)
  File "/opt/stackstorm/st2/lib/python2.7/site-packages/st2reactor/rules/engine.py", line 45, in get_matching_rules_for_trigger
    matching_rules = matcher.get_matching_rules()
  File "/opt/stackstorm/st2/lib/python2.7/site-packages/st2reactor/rules/matcher.py", line 38, in get_matching_rules
    matched_rules = [rule_filter.rule for rule_filter in rule_filters if rule_filter.filter()]
  File "/opt/stackstorm/st2/lib/python2.7/site-packages/st2reactor/rules/filter.py", line 82, in filter
    criterion_k, criterion_v, payload_lookup)
TypeError: 'bool' object is not iterable

Can we look at the rule that's been written? It looks like that's what is causing the issue.
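
A note on the first traceback: "TypeError: expected string or buffer" is what Python 2.7's re module raises when a compiled pattern is matched against a value that is not a string, such as None. The sketch below reproduces that shape. The match_regex function here is a stand-in written for illustration, not the st2 source, and the idea that the payload lookup returned None for the criteria key is an assumption; the log does not show the value that was looked up.

```python
import re


def match_regex(value, criteria_pattern):
    # Stand-in with the same call shape as the traceback; not the st2 code.
    regex = re.compile(criteria_pattern)
    return regex.match(value) is not None


# Pattern copied from the nagios.check_proc rule in the log above.
pattern = r'(.*)_check_(.*)_process'

# A string value is matched (or not) without error.
print(match_regex('web_check_nginx_process', pattern))
print(match_regex('Disk /var/log', pattern))

# A non-string value raises TypeError instead of returning False.
try:
    match_regex(None, pattern)
except TypeError as exc:
    print('TypeError: %s' % exc)
```

If that assumption holds, it may be worth checking that each criteria key in the rule actually resolves to a value in the trigger payload.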

Did you mean the nagios rule?
I used the default one from pack only...

The Nagios pack has 6 rules defined, and the test command does not execute any of them.
Am I doing anything wrong?

Correct me if I am wrong: The rule you were trying to trigger is: nagios_service_proc.yaml

I think the issue is that the action doesn't have a pack name. You might want to update it to something like pack.action. I think the current action, check_pgrep, is just a sample.

Once you fix it, it should work. I tried looking up the actions that are part of the linux pack, but check_pgrep isn't there.
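
To make the pack.action suggestion concrete, here is a small sketch that tells a pack-qualified reference apart from a bare action name. The split_action_ref helper is hypothetical and written only for illustration; it is not part of st2 and does not check that the pack or action actually exists.

```python
def split_action_ref(ref):
    """Split 'pack.action' into (pack, action).

    Hypothetical helper for illustration only; not an st2 API.
    Raises ValueError when the reference has no pack prefix.
    """
    pack, sep, action = ref.partition('.')
    if not sep or not pack or not action:
        raise ValueError('action ref %r is not of the form pack.action' % ref)
    return pack, action


# A pack-qualified reference splits cleanly.
print(split_action_ref('chatops.post_message'))

# The bare name used by the sample rule is rejected.
try:
    split_action_ref('check_pgrep')
except ValueError as exc:
    print(exc)
```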

These rules are a bit outdated. Even so, they are here merely as a guide to help users create their own rules. I would suggest taking them as a starting point to build rules based on your own Nagios checks.

Oh, OK.
Here's one which I had edited.
It is not getting triggered at all...

---
name: notify_chat
pack: nagios
description: Post to chat when nagios service state changes
enabled: true
trigger:
  type: nagios.service_state_change
criteria:
  trigger.attempt:
    pattern: 2
    type: gt
action:
  ref: chatops.post_message
  parameters:
    message: NAGIOS {{trigger.service}} ID:{{trigger.event_id}} STATE:{{trigger.state}}/{{trigger.state_type}}
      {{trigger.msg}}
    channel: 'Remediate'

@vishnu81 Sign up for our community and discuss there if you can
https://stackstorm.com/community-signup.

Also, did you set up chatops already?

I tried signing up there, but the Typeform kept asking for a new team.

Yup, I did set it up; I have a working Spark room.
A manual run of the post_message action works.

Use this to debug the rule issue https://docs.stackstorm.com/troubleshooting/rules.html.

I just signed up for the community and I don't see the issue you're seeing. I'll check our automation logs. If you are trying to log in to Slack and it is asking for a team, use "stackstorm_community".

error: st2.actions.python.SendInviteAction: ERROR    Failed to send invite to vidr@<REDACTED>: {"ok":false,"error":"already_invited"}                               (status code: 200)
None
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/st2actions/runners/python_action_wrapper.py", line 116, in <module>
    obj.run()
  File "/usr/lib/python2.7/dist-packages/st2actions/runners/python_action_wrapper.py", line 61, in run
    output = action.run(**self._parameters)
  File "/opt/stackstorm/packs/slack/actions/send_invite.py", line 45, in run
    raise Exception(failure_reason)
Exception: Failed to send invite to vidr@<REDACTED>: {"ok":false,"error":"already_invited"}

So you are already invited. Please check your email.

I get the thumbs-up icon and the screen stays there...
Do I need to have a stackstorm.com email ID?

Once you finish signing up in Typeform, you have to download the Slack app and log in with your username/password there. Typeform is just used for registration. Next time, please be specific and include screenshots.

Oh, so there's no way to log in to stackstorm-community without installing Slack on a local machine?

Is there a way to access a Spark channel without installing Spark on a local machine? ;)

Luckily, there is a web UI for Slack: https://slack.com/signin.

Thanks Lakshmi, resolved it on Slack...
Had to disable the skeletal rule for check_proc.
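
One plausible reading of why disabling that rule helped, based on the second traceback: matching rules are collected in a single list comprehension (matcher.py, get_matching_rules), so an exception raised while filtering one rule aborts matching for every rule on that trigger, including notify_chat. The sketch below illustrates that behaviour. RuleFilter and both functions are hypothetical stand-ins, not the st2 classes, and the isolated variant is only an illustration of the alternative, not something st2 is known to do.

```python
class RuleFilter(object):
    """Hypothetical stand-in for a rule filter; not the st2 class."""

    def __init__(self, rule, outcome):
        self.rule = rule
        self._outcome = outcome

    def filter(self):
        # An exception outcome simulates a rule whose criteria blow up.
        if isinstance(self._outcome, Exception):
            raise self._outcome
        return self._outcome


def get_matching_rules(rule_filters):
    # Same shape as the comprehension in the traceback:
    # one raising filter aborts the whole list.
    return [f.rule for f in rule_filters if f.filter()]


def get_matching_rules_isolated(rule_filters):
    # Illustrative alternative: a raising filter is skipped, others still run.
    matched = []
    for f in rule_filters:
        try:
            if f.filter():
                matched.append(f.rule)
        except Exception:
            continue
    return matched


filters = [
    RuleFilter('nagios.check_proc', TypeError("'bool' object is not iterable")),
    RuleFilter('nagios.notify_chat', True),
]

# With the broken rule present, nothing is matched: the call raises.
try:
    get_matching_rules(filters)
except TypeError as exc:
    print('matching aborted: %s' % exc)

# With the broken rule removed (disabled), the remaining rule matches.
print(get_matching_rules(filters[1:]))
```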