drewkerrigan/nagios-http-json

OK results do not show Performance Data

Jhamin opened this issue · 12 comments

When a service is in a warning or critical state, the plugin returns the value that generated the alert in the Nagios web GUI as well as in the performance data. However, when a service is in an OK state, the web interface just shows "OK: Status OK." (although the performance data is populated as expected).

It would be very useful for those of us who use the web GUI for system status displays if the value that resulted in an "OK" were displayed as well, not just the values behind a Warning or Critical.

Hi, sounds like a good idea. A PR is welcome, otherwise I'll have a look at it

Hey,

The current behaviour is that you have to specify the metrics you want with -m.

For example:

check_http_json.py --host example.com --ssl --port 443 --path /health -k -Q db,OK
OK: Status OK.
check_http_json.py --host example.com --ssl --port 443 --path /health -k -Q db,OK -m db
OK: Status OK.|'db'=OK

Is that what you need?

Not quite. I understand specifying metrics. I'm having problems with the way the output format gets displayed in Nagios.

If I have a JSON document that returns "System.Uptime" as 12911 and I run some commands against it, I get the following:

./check_http_json.py -H testURL -p json -m "System.Uptime"
returns:
OK: Status OK.|'System.Uptime'=12911

If I add a warning condition that doesn't trigger a warning:
./check_http_json.py -H testURL -p json -m System.Uptime -w System.Uptime,~:15000
Then I get the same result:
OK: Status OK.|'System.Uptime'=12911
(Before the "|" we just have OK: Status OK)

If I add a warning condition that does trigger:
./check_http_json.py -H testURL -p json -m System.Uptime -w System.Uptime,~:10000
Then I get the following:
WARNING: Status WARNING. Value (12911) for key System.Uptime was greater than 10000.|'System.Uptime'=12911
(The important thing is that there are data values before and after the "|")
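
For readers unfamiliar with the threshold syntax used above: the sketch below shows how a range such as `~:10000` is usually interpreted under the general Nagios plugin guidelines (alert when the value falls outside the range, or inside it when prefixed with `@`). The function names `parse_range` and `is_alert` are hypothetical and this is not taken from check_http_json.py itself, whose own parsing may differ.

```python
import math


def parse_range(spec):
    """Parse a Nagios-style range string into (start, end, inside).

    Assumed conventions: "10" means 0..10, "10:" means 10..+inf,
    "~:10" means -inf..10, and a leading "@" inverts the check.
    """
    inside = spec.startswith("@")
    if inside:
        spec = spec[1:]
    if ":" in spec:
        start_s, end_s = spec.split(":", 1)
    else:
        # A bare number is the upper bound; the lower bound defaults to 0.
        start_s, end_s = "0", spec
    if start_s == "~":
        start = -math.inf
    elif start_s == "":
        start = 0.0
    else:
        start = float(start_s)
    end = math.inf if end_s == "" else float(end_s)
    return start, end, inside


def is_alert(value, spec):
    """Return True if the value should raise an alert for the given range."""
    start, end, inside = parse_range(spec)
    in_range = start <= value <= end
    return in_range if inside else not in_range


print(is_alert(12911, "~:15000"))  # uptime below the limit
print(is_alert(12911, "~:10000"))  # uptime above the limit
```

With the uptime of 12911 from the examples, the first range does not alert and the second one does, which matches the OK and WARNING results shown above.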

The problem is that when Nagios processes these, everything before the "|" gets displayed in the GUI and everything after the "|" gets recorded as Performance Data.
My concern is that, while the performance data is vital for logging, "OK: Status OK" is all that shows up in the GUI summary for the service. If you have this up on a monitor, someone glancing at it can see that the status is OK but can't see what the value is.
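
To illustrate the split described here, a minimal sketch that assumes a single-line plugin result (real Nagios output can span several lines, which this ignores). The helper name `split_plugin_output` is made up for illustration and is not part of Nagios or the plugin.

```python
def split_plugin_output(line):
    """Split one line of plugin output at the first "|".

    Returns (status_text, perfdata): the part shown in the GUI and the
    part recorded as performance data. perfdata is "" if there is no "|".
    """
    status_text, _, perfdata = line.partition("|")
    return status_text.strip(), perfdata.strip()


status_text, perfdata = split_plugin_output("OK: Status OK.|'System.Uptime'=12911")
print(status_text)  # the only part a person sees in the GUI
print(perfdata)     # the part that carries the actual value
```

For the OK case, the value only ever appears in the second element, which is why it never reaches the GUI summary.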

I'm hoping the output can be adjusted to return something like:
OK: 'System.Uptime'=12911|'System.Uptime'=12911
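
Roughly, the requested behaviour could look like the sketch below. It assumes the metrics are available as a plain dict; `format_metrics` and `build_output` are hypothetical names, not functions from check_http_json.py.

```python
def format_metrics(metrics):
    """Render metrics as space-separated 'key'=value pairs."""
    return " ".join("'{}'={}".format(key, value) for key, value in metrics.items())


def build_output(status, metrics):
    """Build a one-line result that repeats the metrics before the "|".

    Falls back to the plain "Status ..." text when no metrics were requested.
    """
    rendered = format_metrics(metrics)
    if rendered:
        return "{}: {}|{}".format(status, rendered, rendered)
    return "{}: Status {}.".format(status, status)


print(build_output("OK", {"System.Uptime": 12911}))
```

Repeating the same rendering on both sides of the pipe keeps the performance data unchanged while making the value visible in the GUI.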

OK, I think I understand. Yeah, that makes sense; other checks also show some additional details:

root@4659771706e7:/# ./usr/lib/nagios/plugins/check_ping -H localhost -w 1,2% -c 3,4% -4
PING OK - Packet loss = 0%, RTA = 0.06 ms|rta=0.062000ms;1.000000;3.000000;0.000000 pl=0%;2;4;0


root@4659771706e7:/# ./usr/lib/nagios/plugins/check_icmp localhost   
OK - localhost: rta 0.042ms, lost 0%|rta=0.042ms;200.000;500.000;0; pl=0%;40;80;; rtmax=0.086ms;;;; rtmin=0.029ms;;;; 

I'll have a look at it.

Hello,
At least we could add the key or the alias to the status information, like this:
OK: key[>alias] Status OK.
This development is included in milestone 2.0.
When is it planned?
Thank you

Hi, I'm planning to work on this in the next few days. Version 2.0 should be ready next week or so. You can help by testing the python3 branch.

Hello, best I could come up with so far is this: https://github.com/drewkerrigan/nagios-http-json/tree/performance-data

Example output:

check_http_json.py -H localhost -P 8000 --path example.json -e 'status' -m status

OK: 'status'={'conn': 'ok', 'uptime': 123}  Status OK.|'status'={'conn': 'ok', 'uptime': 123}

But I'm not sure if this is the best solution. @drewkerrigan Maybe you have an idea on how to do this better?

@martialblog I think we can do a little better with the formatting before the pipe. I’ll add to your branch shortly.

@drewkerrigan Jup, good idea. 👍

@martialblog I messed around with adding an additional summary value returned from checkMetrics, which just formats the metrics slightly differently, but honestly it's not worth it. Your solution works fine and is exactly what @Jhamin asked for. I say go for it.

@drewkerrigan Yeah, I didn't see any other option. I did refactor the getMessage() function a bit, though. We can still add a formatting function later on.

@Jhamin, if you want, you can test this branch with the new feature: https://github.com/drewkerrigan/nagios-http-json/tree/performance-data

This is included in v2.0.

See PR #57