Feature: Add possibility to monitor the state of single cluster nodes
K0nne opened this issue · 6 comments
Hello,
we would love to monitor the state of single cluster nodes so that we can react when a machine enters the paused state, for example by activating a downtime via an event command. The functionality is already part of Invoke-IcingaCheckClusterHealth. Something like this could be helpful:
Invoke-IcingaCheckClusterNode -nodename %nodename% -state paused
[CRITICAL] Check package "Cluster Node" - [CRITICAL] State
\_ [CRITICAL] Check package "%nodename%"
\_ [CRITICAL] State: Value "Paused" is matching threshold "Paused"
What do you think about this?
Can you please test the linked PR and check whether the feature works properly and as expected?
Please note that we added the argument as an array, which means you can add multiple includes/excludes to the list, with wildcard filtering.
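To illustrate what array-based include/exclude lists with wildcards could mean in practice, here is a minimal Python sketch. It is not the plugin's implementation: the function name `filter_nodes`, the rule that an empty include list selects everything, and the rule that an exclude match wins over an include match are all assumptions made for illustration.

```python
from fnmatch import fnmatchcase


def filter_nodes(nodes, include=(), exclude=()):
    """Return the node names that pass the include/exclude wildcard lists.

    Assumptions (not confirmed by the plugin): an empty include list
    selects every node, an exclude match always wins, and matching is
    case-insensitive.
    """
    def matches(name, patterns):
        # Lowercase both sides so the comparison ignores case.
        return any(fnmatchcase(name.lower(), p.lower()) for p in patterns)

    selected = []
    for name in nodes:
        if include and not matches(name, include):
            continue  # not covered by any include pattern
        if matches(name, exclude):
            continue  # explicitly excluded
        selected.append(name)
    return selected


# Hypothetical node names, for illustration only.
nodes = ["node-a1", "node-a2", "node-b1"]
print(filter_nodes(nodes, include=["node-a*"], exclude=["*2"]))
```

With these hypothetical names, only `node-a1` remains: it matches the include pattern `node-a*` and is not caught by the exclude pattern `*2`.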
This PR now includes all previous changes and the other commit. Testing this is like testing the release tomorrow.
I have tested the PR and it is working. I can filter for single hosts and the check goes critical if the host is paused.
But in my opinion the output of "Cluster Resources" is unnecessary in this case, because we are only interested in the state of the given host. I want to use an event command if this check goes critical, and I fear that if "Cluster Resources" goes critical, it could lead to a false positive execution.
icinga { invoke-icingacheckclusterhealth -include %hostname% -verbosity 3 }
[CRITICAL] Cluster Services [CRITICAL] Cluster Nodes [WARNING] Cluster Resources (All must be [OK])
\_ [CRITICAL] Cluster Nodes (All must be [OK])
\_ [CRITICAL] %hostname% (All must be [OK])
\_ [CRITICAL] #9 State: Paused is matching threshold Paused
\_ [OK] #9 Status Information: 8
\_ [WARNING] Cluster Resources (All must be [OK])
[lots of VMs]
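One way to guard an event command against the false positive described above is to look only at the node's own State line instead of the overall check status. The following Python sketch works under stated assumptions: the helper name `node_state_is_critical` is hypothetical, the sample text is modelled on the output above with a made-up node name `node1`, and package lines are recognised by the `(All must be` suffix seen in that output.

```python
import re

STATUS_RE = re.compile(r"\[(OK|WARNING|CRITICAL|UNKNOWN)\]\s+(.*)")


def node_state_is_critical(output, node):
    """Return True only if the State line of the given node is CRITICAL.

    Assumption: a line containing "(All must be" opens a new check package,
    and lines after it belong to that package until the next package line.
    """
    in_node = False
    for line in output.splitlines():
        match = STATUS_RE.search(line)
        if not match:
            continue
        status, rest = match.groups()
        if "(All must be" in rest:
            # A new package starts; remember whether it is the node's package.
            in_node = rest.startswith(node)
        elif in_node and re.search(r"\bState:", rest):
            return status == "CRITICAL"
    return False


# Sample modelled on the output in this thread; "node1" is a placeholder.
sample = """[CRITICAL] Cluster Services [CRITICAL] Cluster Nodes [WARNING] Cluster Resources (All must be [OK])
\\_ [CRITICAL] Cluster Nodes (All must be [OK])
\\_ [CRITICAL] node1 (All must be [OK])
\\_ [CRITICAL] #9 State: Paused is matching threshold Paused
\\_ [OK] #9 Status Information: 8
\\_ [WARNING] Cluster Resources (All must be [OK])"""

print(node_state_is_critical(sample, "node1"))
```

With this approach a WARNING or CRITICAL in "Cluster Resources" alone would not trigger the event command, because only the node's State line is evaluated.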
Thank you for the input. What exactly is listed below that? In my opinion it can only be the general cluster resources for this cluster, because nothing else is added.
It should be the general state of the cluster. If you cannot share it here on GitHub, can you please send the full output by mail?
I will send it to you tomorrow morning.
The new PR is looking good 🥇