patricegautier/unifiZabbix

mca-dump-short script issue

commando768594 opened this issue · 9 comments

Hi,
Thanks for the hard work on this template, it is a great one.
I have a Zabbix 6.0 LTS environment monitoring Unifi APs, and I installed this template without any problem. But the mca-dump-short.sh script sometimes returns an error.
For example, here is what I get when I run the script manually:

$ /usr/lib/zabbix/externalscripts/mca-dump-short.sh -d AP_address -u admin -i /home/zabbix/.ssh/zabbix/ -t AP -o 15

{ "at":"14:24:59", "r":"validationError: .vap_table? != null and .radio_table != null and ( .radio_table | map(select(.athstats.cu_total>0)) | length>0 ) ", "device":"AP_address", "mcaDumpError":"Error" }

Sometimes it returns the error immediately, and sometimes it returns the correct long output.
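For readers who do not use jq, the expression in the validationError message can be read as three conditions: `vap_table` is present, `radio_table` is present, and at least one radio reports `athstats.cu_total` greater than 0. The following is a rough Python rendering of that check. It is based only on the expression quoted in the error message, not on the script's source, and the function name `passes_validation` and the sample dictionaries are made up for illustration.

```python
def passes_validation(dump: dict) -> bool:
    """Approximate the jq check quoted in the validationError message.

    jq: .vap_table? != null and .radio_table != null
        and (.radio_table | map(select(.athstats.cu_total>0)) | length>0)
    """
    if dump.get("vap_table") is None:
        return False
    radios = dump.get("radio_table")
    if radios is None:
        return False
    for radio in radios:
        cu_total = (radio.get("athstats") or {}).get("cu_total")
        # Only plain numbers are handled here; jq's ordering rules for
        # other value types are not reproduced in this sketch.
        if (
            isinstance(cu_total, (int, float))
            and not isinstance(cu_total, bool)
            and cu_total > 0
        ):
            return True
    return False


# Hypothetical sample payloads, not taken from a real AP.
full_dump = {"vap_table": [], "radio_table": [{"athstats": {"cu_total": 12}}]}
idle_dump = {"vap_table": [], "radio_table": [{"athstats": {"cu_total": 0}}]}
short_dump = {"hostname": "AP1", "inform_url": "http://my_unifi_server_ip:8080/inform"}

print(passes_validation(full_dump))   # True
print(passes_validation(idle_dump))   # False
print(passes_validation(short_dump))  # False
```

Under this reading, any payload that lacks `radio_table` or `vap_table`, or whose radios all report a `cu_total` of 0, would be rejected with the validationError seen above.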

I checked the Zabbix server log, and there are a lot of errors like this for all APs.

4087107:20230328:142756.565 item "AP_address:mca-dump-short.sh["-d","{HOST.CONN}", "-P", "{$UNIFI_SSH_PORT}", "-u", "{$UNIFI_USER}", "-i", "{$UNIFI_SSH_PRIV_KEY_PATH}", "-t", "AP", "-p", "{$UNIFI_SSHPASS_PASSWORD_PATH}", "-o", "{$UNIFI_CHECK_TIMEOUT}","-b"]" became not supported: Preprocessing failed for: { "at":"14:27:55", "r":"validationError: .vap_table? != null and .radio_table != null and ( .radi...

4087107:20230328:143255.908 error reason for "AP_address:mca-dump-short.sh["-d","{HOST.CONN}", "-P", "{$UNIFI_SSH_PORT}", "-u", "{$UNIFI_USER}", "-i", "{$UNIFI_SSH_PRIV_KEY_PATH}", "-t", "AP", "-p", "{$UNIFI_SSHPASS_PASSWORD_PATH}", "-o", "{$UNIFI_CHECK_TIMEOUT}","-b"]" changed: Preprocessing failed for: { "at":"14:32:55", "r":"validationError: .vap_table? != null and .radio_table != null and ( .radi...

Please advise where the problem is.

Thanks!

What kind of AP is it and which fw?

Hi,
I am using
UAP-AC-Pro, v6.2.35
UAP-AC-HD, v6.2.39

As I said, it sometimes returns the correct long output. The problem happens on all APs during working hours.

does it get any better if you give it a bigger timeout value for -o, say 30?

Hi,
No, it is the same with the timeout set to 30 s, which is the maximum for the Zabbix poller.
In fact, when I run it manually from the Zabbix server, the error comes back within 2 seconds.

Actually, I just saw your comment that it returns the error immediately. I think this could mean the AP is too busy to collect values and report. Is there anything interesting on your Zabbix server in /tmp/mcaDumpShort.log or .err?

Also, are you on the latest version of the templates? I did put some effort into catching the errors in Zabbix and at least not polluting the logs.

Oh, there is a big mcaDumpShort.err file there, already larger than 1 GB.
When I found the new version of the template, I deleted the old one from the Zabbix server and imported the new one. I think I can try that again.

Here are a few of the newest errors in the file. I think I found a strange line: why does it say "Unable to resolve (http://unifi:8080/inform)"?

=================================================

Wed 29 Mar 2023 02:20:36 PM CST AP1
{ "at":"14:20:36", "r":"validationError: .vap_table? != null and .radio_table != null and ( .radio_table | map(select(.athstats.cu_total>0)) | length>0 ) ", "device":"AP1", "mcaDumpError":"Error" }\n {
"anon_id": "dce03194-2ecf-44ac-8455-90cc7ee42d8f",
"antenna_table": [
{
"default": true,
"id": 4,
"name": "Combined",
"wifi0_gain": 3,
"wifi1_gain": 3
}
],
"architecture": "mips",
"board_rev": 26,
"bootid": 1,
"bootrom_version": "unifi-v2.0.9.307-gd60f380a",
"cfgversion": "943ee365b1081122",
"connect_request_ip": "x.x.x.x",
"connect_request_port": "50255",
"countrycode_table": [],
"default": false,
"discovery_response": false,
"dualboot": true,
"ever_crash": false,
"guest_kicks": 0,
"guest_token": "2CE5C3169301D9BA36A4ED6287ABFCD6",
"hash_id": "c45590cc7ee42d8f",
"hostname": "AP1",
"inform_as_notif": true,
"inform_min_interval": 5,
"inform_url": "http://my_unifi_server_ip:8080/inform",
"ip": "x.x.x.x",
"isolated": false,
"kernel_version": "4.4.153",
"last_error_conns": [
{
"error_reason": 1,
"last_error_str": "Unable to resolve (http://unifi:8080/inform)",
"last_managed_e_time": 558556,
"last_managed_s_time": 558088,
"timestamp": "Wed Mar 29 10:30:20 2023"
},
{
"error_reason": 4,
"last_error_str": "Timeout (http://my_unifi_server_ip:8080/inform)",
"last_managed_e_time": 558556,
"last_managed_s_time": 558607,
"timestamp": "Wed Mar 29 10:38:31 2023"
},
{
"error_reason": 4,
"last_error_str": "Timeout (http://my_unifi_server_ip:8080/inform)",
"last_managed_e_time": 559164,
"last_managed_s_time": 558607,
"timestamp": "Wed Mar 29 10:39:48 2023"
},
{
"error_reason": 1,
"last_error_str": "Unable to resolve (http://unifi:8080/inform)",
"last_managed_e_time": 559164,
"last_managed_s_time": 558607,
"timestamp": "Wed Mar 29 10:40:29 2023"
},
{
"error_reason": 4,
"last_error_str": "Timeout (http://my_unifi_server_ip:8080/inform)",
"last_managed_e_time": 559164,
"last_managed_s_time": 559241,
"timestamp": "Wed Mar 29 10:44:05 2023"
},
{
"error_reason": 4,
"last_error_str": "Timeout (http://my_unifi_server_ip:8080/inform)",
"last_managed_e_time": 559496,
"last_managed_s_time": 559241,
"timestamp": "Wed Mar 29 10:45:19 2023"
},
{
"error_reason": 1,
"last_error_str": "Unable to resolve (http://unifi:8080/inform)",
"last_managed_e_time": 559496,
"last_managed_s_time": 559241,
"timestamp": "Wed Mar 29 10:46:00 2023"
},
{
"error_reason": 4,
"last_error_str": "Timeout (http://my_unifi_server_ip:8080/inform)",
"last_managed_e_time": 559496,
"last_managed_s_time": 559241,
"timestamp": "Wed Mar 29 10:46:36 2023"
},
{
"error_reason": 1,
"last_error_str": "Unable to resolve (http://unifi:8080/inform)",
"last_managed_e_time": 559496,
"last_managed_s_time": 559241,
"timestamp": "Wed Mar 29 10:46:41 2023"
},
{
"error_reason": 4,
"last_error_str": "Timeout (http://my_unifi_server_ip:8080/inform)",
"last_managed_e_time": 559496,
"last_managed_s_time": 559585,
"timestamp": "Wed Mar 29 13:32:47 2023"
}
],
"locating": false,
"mac": "fc:ec:da:f0:bc:2a",
"manufacturer_id": 4,
"model": "U7PG2",
"model_display": "UAP-AC-Pro-Gen2",
"netmask": "x.x.x.x",
"notif_payload": {
"assoc_status": "0", "auth_ts": "572391.735361",
"event_id": "1",
"event_type": "failure",
"mac": "ee:cd:b0:77:x:x",
"message_type": "STA_ASSOC_TRACKER",
"vap": "ath3"
},
"notif_reason": "event",
"required_version": "3.4.1",
"selfrun_beacon": true,
"serial": "FCECDAF0BC2A",
"state": 2,
"sys_error_caps": 0,
"time": 1680070835,
"uptime": 572413,
"version": "6.2.35.14043"

Hi,
The AP is connected to the controller correctly, and the .err file does contain my correct controller address.
But I found where "http://unifi" comes from: it is in the output of the mca-dump command.

I will check how to fix this. :)
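To see at a glance which inform URLs the AP has been failing against, the `last_error_conns` list from the mca-dump output can be grouped by message. This is a small sketch under the assumption that the dump has already been parsed into a Python dict. The function name `summarize_errors` is hypothetical, and the sample holds only three entries copied from the dump in this thread.

```python
from collections import Counter


def summarize_errors(dump: dict) -> Counter:
    """Count how often each last_error_str appears in last_error_conns."""
    return Counter(
        entry.get("last_error_str", "")
        for entry in dump.get("last_error_conns", [])
    )


# Three entries copied from the dump in this thread; the real list is longer.
sample = {
    "last_error_conns": [
        {"error_reason": 1,
         "last_error_str": "Unable to resolve (http://unifi:8080/inform)"},
        {"error_reason": 4,
         "last_error_str": "Timeout (http://my_unifi_server_ip:8080/inform)"},
        {"error_reason": 4,
         "last_error_str": "Timeout (http://my_unifi_server_ip:8080/inform)"},
    ]
}

# Most frequent message first.
for message, count in summarize_errors(sample).most_common():
    print(count, message)
```

In the full dump above, the "Unable to resolve" entries all name http://unifi:8080/inform, while the "Timeout" entries name the configured controller address.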

"last_error_str": "Unable to resolve (http://unifi:8080/inform)",