bb-Ricardo/check_redfish

Optional filter for --mel or --sel log checks

log1-c opened this issue · 5 comments

Hi :)

Would it be possible to implement a filtering mechanism for the --mel or --sel checks to ignore some messages based on their content?

The reason why I am thinking of such an option is that I have a customer whose Dell systems (XC640-10 and XC740XD-24) are reporting the following message on a daily basis as CRITICAL (though the message is a warning state according to the notes):

[CRITICAL]: 2022-02-08T05:03:04-06:00: The iDRAC Service Module communication with iDRAC has ended.

I have found a release note document (https://dl.dell.com/topicspdf/idrac-service-module_release-notes11_en-us.pdf) where it is said that this is normal behavior.

Not sure if such a filter would blow up the check too much (runtime-wise) or if it is too complicated to add though :)
The customer will also check if updating the firmware changes the behavior.

Cheers
log1c ✌️

Hi,

I think an exclude filter for log messages should be very possible and quite easy to implement. I would suggest the following:

--log_exclude = "The iDRAC Service Module communication with iDRAC has ended"

The value should be treated as a regex.

And for multiple matches:

--log_exclude = '"log message, with a comma", another log message, user .* logged in'

what do you think?
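
To make the proposal concrete, here is a minimal sketch of how such a value could be split and applied. It is not the code in check_redfish; the function names `parse_exclude_patterns` and `filter_log_entries` are hypothetical, and it assumes that double quotes protect commas inside an entry and that a pattern may match anywhere in the message (case-sensitive).

```python
import csv
import re


def parse_exclude_patterns(raw):
    """Split a comma-separated option value into compiled regexes.

    Hypothetical helper: double-quoted entries may contain commas.
    """
    # csv.reader handles the quoting; skipinitialspace drops the blank
    # that follows each separating comma.
    reader = csv.reader([raw], skipinitialspace=True)
    fields = next(reader, [])
    # Ignore empty entries, e.g. from a trailing comma.
    return [re.compile(field.strip()) for field in fields if field.strip()]


def filter_log_entries(messages, patterns):
    """Return only the messages that match none of the exclude patterns."""
    return [
        message
        for message in messages
        if not any(pattern.search(message) for pattern in patterns)
    ]


if __name__ == "__main__":
    patterns = parse_exclude_patterns(
        '"log message, with a comma", another log message, user .* logged in'
    )
    entries = [
        "log message, with a comma",
        "user admin logged in",
        "The Integrated NIC 1 Port 1 network link is down.",
    ]
    for entry in filter_log_entries(entries, patterns):
        print(entry)
```

Using `search` rather than a full match means a short fragment such as "network link is down" is enough to exclude an entry, which fits the idea of ignoring messages based on their content.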

that sounds very good, I like it a lot!

Hi,

I finally got some time to implement the filter. Can you check out the next-release branch and test it?

Thank you

looking good!

# ./check_redfish.py '--mel' ...
[CRITICAL]: 2022-03-04T09:48:35-06:00: The iDRAC Service Module communication with iDRAC has ended.
[CRITICAL]: 2022-03-04T09:36:13-06:00: The iDRAC Service Module communication with iDRAC has ended.
[WARNING]: 2022-03-03T09:13:19-06:00: The iDRAC Service Module communication with iDRAC has ended.
[WARNING]: 2022-03-02T15:40:15-06:00: The Integrated NIC 1 Port 1 network link is down.
[WARNING]: 2022-03-02T15:40:15-06:00: The Integrated NIC 1 Port 2 network link is down.
[WARNING]: 2022-03-02T15:40:12-06:00: The iDRAC Service Module communication with iDRAC has ended.
[WARNING]: 2022-03-02T08:16:53-06:00: The iDRAC Service Module communication with iDRAC has ended.

# ./check_redfish.py '--mel' ... --log_exclude "The iDRAC Service Module communication with iDRAC has ended"
[WARNING]: 2022-03-02T15:40:15-06:00: The Integrated NIC 1 Port 1 network link is down.
[WARNING]: 2022-03-02T15:40:15-06:00: The Integrated NIC 1 Port 2 network link is down.

# ./check_redfish.py '--mel' ... --log_exclude '"The iDRAC Service Module communication with iDRAC has ended", "network link is down"'
[OK]: Manager Event Log contains 2437 OK entries. Most recent notable: [OK]: 2022-03-07T10:00:13-06:00: Successfully logged in using PTAdmin, from 169.254.0.2 and WS-MAN.

Works when configured via the Icinga Director as well :)
Thanks a lot!

Thank you for testing. I will close this issue.