wrong psu1_state, psu2_state
I am getting 0 for these stats, but in the code I see that if the state is ok then it must be 1.
@smitas3400 can you pls provide terminal listing for this one
In Prometheus I get mktxp_system_psu1_state{instance="10.10.100.101:49090", job="mktxp", routerboard_address="ipadresa", routerboard_name="giraite_pusyno_spinta"} | 0
But if I understand correctly, it must be 1 if the PSU state is "ok" and 0 if it is "fail"?
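The expectation described here can be written down as a small mapping. This is only a sketch of the expected behaviour, not mktxp's actual code; the function name `psu_state_to_gauge` is hypothetical, and treating every value other than "ok" as 0 is an assumption.

```python
def psu_state_to_gauge(value: str) -> float:
    """Map a RouterOS health state string to a gauge value.

    Assumption: "ok" means healthy (1.0); any other value,
    for example "fail", is reported as 0.0.
    """
    return 1.0 if value.strip().lower() == "ok" else 0.0
```

With the terminal output showing value=ok for both PSUs, this mapping would give 1.0 for each of them.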
And what do you get in your router terminal? Can you share the result of system/health/print detail?
Also, use wget or a browser to access mktxp_ip_address:49090
and then check for / share the psu1_state metrics from there?
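If wget or a browser is not handy, the same check can be done with a few lines of Python. A minimal sketch using only the standard library; the helper names `fetch_metrics` and `metric_lines` are made up for this example, and `mktxp_ip_address` is a placeholder for wherever mktxp is listening.

```python
from urllib.request import urlopen


def fetch_metrics(url: str, timeout: float = 5.0) -> str:
    """Download the raw exposition text from the exporter."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def metric_lines(exposition_text: str, metric_name: str) -> list[str]:
    """Return the sample lines (not the # HELP / # TYPE lines) for one metric."""
    return [
        line
        for line in exposition_text.splitlines()
        if line.startswith(metric_name + "{") or line.startswith(metric_name + " ")
    ]


# Example usage (placeholder address, replace with your own):
# text = fetch_metrics("http://mktxp_ip_address:49090")
# for line in metric_lines(text, "mktxp_system_psu1_state"):
#     print(line)
```

More than one printed line with identical labels means the exporter itself is emitting the same series several times.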
mikrotik terminal
0 name="sfp-temperature" value=49 type=C
1 name="switch-temperature" value=48 type=C
2 name="fan-state" value=ok type=""
3 name="fan1-speed" value=4080 type=RPM
4 name="fan2-speed" value=4125 type=RPM
5 name="fan3-speed" value=4125 type=RPM
6 name="psu1-state" value=ok type=""
7 name="psu2-state" value=ok type=""
in mktxp
# HELP mktxp_system_psu1_state System PSU1 state
# TYPE mktxp_system_psu1_state gauge
mktxp_system_psu1_state{routerboard_address="10.0.226.7",routerboard_name="giraite_pusyno_spinta"} 0.0
# HELP mktxp_system_psu2_state System PSU2 state
# TYPE mktxp_system_psu2_state gauge
mktxp_system_psu2_state{routerboard_address="10.0.226.7",routerboard_name="giraite_pusyno_spinta"} 0.0
# HELP mktxp_system_switch_temperature Current switch temperature
# TYPE mktxp_system_switch_temperature gauge
mktxp_system_switch_temperature{routerboard_address="10.0.226.7",routerboard_name="giraite_pusyno_spinta"} 48.0
# HELP mktxp_system_psu1_state System PSU1 state
# TYPE mktxp_system_psu1_state gauge
mktxp_system_psu1_state{routerboard_address="10.0.226.7",routerboard_name="giraite_pusyno_spinta"} 0.0
# HELP mktxp_system_psu2_state System PSU2 state
# TYPE mktxp_system_psu2_state gauge
mktxp_system_psu2_state{routerboard_address="10.0.226.7",routerboard_name="giraite_pusyno_spinta"} 0.0
# HELP mktxp_system_psu1_state System PSU1 state
# TYPE mktxp_system_psu1_state gauge
mktxp_system_psu1_state{routerboard_address="10.0.226.7",routerboard_name="giraite_pusyno_spinta"} 0.0
# HELP mktxp_system_psu2_state System PSU2 state
# TYPE mktxp_system_psu2_state gauge
mktxp_system_psu2_state{routerboard_address="10.0.226.7",routerboard_name="giraite_pusyno_spinta"} 0.0
# HELP mktxp_system_fan_one_speed System fan 1 current speed
# TYPE mktxp_system_fan_one_speed gauge
mktxp_system_fan_one_speed{routerboard_address="10.0.226.7",routerboard_name="giraite_pusyno_spinta"} 4050.0
# HELP mktxp_system_psu1_state System PSU1 state
# TYPE mktxp_system_psu1_state gauge
mktxp_system_psu1_state{routerboard_address="10.0.226.7",routerboard_name="giraite_pusyno_spinta"} 0.0
# HELP mktxp_system_psu2_state System PSU2 state
# TYPE mktxp_system_psu2_state gauge
mktxp_system_psu2_state{routerboard_address="10.0.226.7",routerboard_name="giraite_pusyno_spinta"} 0.0
# HELP mktxp_system_fan_two_speed System fan 2 current speed
# TYPE mktxp_system_fan_two_speed gauge
mktxp_system_fan_two_speed{routerboard_address="10.0.226.7",routerboard_name="giraite_pusyno_spinta"} 4125.0
# HELP mktxp_system_psu1_state System PSU1 state
# TYPE mktxp_system_psu1_state gauge
mktxp_system_psu1_state{routerboard_address="10.0.226.7",routerboard_name="giraite_pusyno_spinta"} 0.0
# HELP mktxp_system_psu2_state System PSU2 state
# TYPE mktxp_system_psu2_state gauge
mktxp_system_psu2_state{routerboard_address="10.0.226.7",routerboard_name="giraite_pusyno_spinta"} 0.0
# HELP mktxp_system_fan_three_speed System fan 3 current speed
# TYPE mktxp_system_fan_three_speed gauge
mktxp_system_fan_three_speed{routerboard_address="10.0.226.7",routerboard_name="giraite_pusyno_spinta"} 4125.0
# HELP mktxp_system_psu1_state System PSU1 state
# TYPE mktxp_system_psu1_state gauge
mktxp_system_psu1_state{routerboard_address="10.0.226.7",routerboard_name="giraite_pusyno_spinta"} 0.0
# HELP mktxp_system_psu2_state System PSU2 state
# TYPE mktxp_system_psu2_state gauge
mktxp_system_psu2_state{routerboard_address="10.0.226.7",routerboard_name="giraite_pusyno_spinta"} 0.0
# HELP mktxp_system_psu1_state System PSU1 state
# TYPE mktxp_system_psu1_state gauge
mktxp_system_psu1_state{routerboard_address="10.0.226.7",routerboard_name="giraite_pusyno_spinta"} 1.0
# HELP mktxp_system_psu2_state System PSU2 state
# TYPE mktxp_system_psu2_state gauge
mktxp_system_psu2_state{routerboard_address="10.0.226.7",routerboard_name="giraite_pusyno_spinta"} 0.0
# HELP mktxp_system_psu1_state System PSU1 state
# TYPE mktxp_system_psu1_state gauge
mktxp_system_psu1_state{routerboard_address="10.0.226.7",routerboard_name="giraite_pusyno_spinta"} 0.0
# HELP mktxp_system_psu2_state System PSU2 state
# TYPE mktxp_system_psu2_state gauge
mktxp_system_psu2_state{routerboard_address="10.0.226.7",routerboard_name="giraite_pusyno_spinta"} 1.0
I just don't understand why there are so many entries for the PSU state; some show the normal status, others only 0.
And now I have started getting an error in Prometheus. It started when I launched mktxp in my monitoring stack:
name: PrometheusTargetScrapeDuplicate
expr: increase(prometheus_target_scrapes_sample_duplicate_timestamp_total[5m]) > 0
labels:
  severity: warning
  source: prometheus
annotations:
  description: Prometheus has many samples rejected due to duplicate timestamps but different values
    VALUE = {{ $value }}
    LABELS = {{ $labels }}
  summary: Prometheus target scrape duplicate (instance {{ $labels.instance }})
And there are multiple psu_state entries for all MikroTik devices.
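To see exactly which series trigger that alert, the scraped text can be checked for repeats. A rough sketch, not part of mktxp or Prometheus: the function name `duplicate_series` is hypothetical, and it assumes sample lines carry no trailing timestamp, which matches the output pasted in this thread.

```python
from collections import Counter


def duplicate_series(exposition_text: str) -> dict[str, int]:
    """Count series (metric name plus label set) that occur more than once."""
    counts: Counter = Counter()
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and the # HELP / # TYPE comment lines.
        if not line or line.startswith("#"):
            continue
        # Assumption: the sample value is the last space-separated field,
        # so everything before it identifies the series.
        series, _, _value = line.rpartition(" ")
        if series:
            counts[series] += 1
    return {series: n for series, n in counts.items() if n > 1}
```

Any key returned here is a series that appears several times in a single scrape, which is what Prometheus rejects when the values differ.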
can you try out now with the latest?
still same
# HELP mktxp_system_psu1_state System PSU1 state
# TYPE mktxp_system_psu1_state gauge
mktxp_system_psu1_state{routerboard_address="10.0.226.7",routerboard_name="giraite_pusyno_spinta"} 0.0
# HELP mktxp_system_psu2_state System PSU2 state
# TYPE mktxp_system_psu2_state gauge
mktxp_system_psu2_state{routerboard_address="10.0.226.7",routerboard_name="giraite_pusyno_spinta"} 0.0
# HELP mktxp_system_switch_temperature Current switch temperature
# TYPE mktxp_system_switch_temperature gauge
mktxp_system_switch_temperature{routerboard_address="10.0.226.7",routerboard_name="giraite_pusyno_spinta"} 47.0
# HELP mktxp_system_psu1_state System PSU1 state
# TYPE mktxp_system_psu1_state gauge
mktxp_system_psu1_state{routerboard_address="10.0.226.7",routerboard_name="giraite_pusyno_spinta"} 0.0
# HELP mktxp_system_psu2_state System PSU2 state
# TYPE mktxp_system_psu2_state gauge
mktxp_system_psu2_state{routerboard_address="10.0.226.7",routerboard_name="giraite_pusyno_spinta"} 0.0
# HELP mktxp_system_psu1_state System PSU1 state
# TYPE mktxp_system_psu1_state gauge
mktxp_system_psu1_state{routerboard_address="10.0.226.7",routerboard_name="giraite_pusyno_spinta"} 0.0
# HELP mktxp_system_psu2_state System PSU2 state
# TYPE mktxp_system_psu2_state gauge
mktxp_system_psu2_state{routerboard_address="10.0.226.7",routerboard_name="giraite_pusyno_spinta"} 0.0
# HELP mktxp_system_fan_one_speed System fan 1 current speed
# TYPE mktxp_system_fan_one_speed gauge
mktxp_system_fan_one_speed{routerboard_address="10.0.226.7",routerboard_name="giraite_pusyno_spinta"} 4065.0
# HELP mktxp_system_psu1_state System PSU1 state
# TYPE mktxp_system_psu1_state gauge
mktxp_system_psu1_state{routerboard_address="10.0.226.7",routerboard_name="giraite_pusyno_spinta"} 0.0
# HELP mktxp_system_psu2_state System PSU2 state
# TYPE mktxp_system_psu2_state gauge
mktxp_system_psu2_state{routerboard_address="10.0.226.7",routerboard_name="giraite_pusyno_spinta"} 0.0
# HELP mktxp_system_fan_two_speed System fan 2 current speed
# TYPE mktxp_system_fan_two_speed gauge
mktxp_system_fan_two_speed{routerboard_address="10.0.226.7",routerboard_name="giraite_pusyno_spinta"} 4125.0
# HELP mktxp_system_psu1_state System PSU1 state
# TYPE mktxp_system_psu1_state gauge
mktxp_system_psu1_state{routerboard_address="10.0.226.7",routerboard_name="giraite_pusyno_spinta"} 0.0
# HELP mktxp_system_psu2_state System PSU2 state
# TYPE mktxp_system_psu2_state gauge
mktxp_system_psu2_state{routerboard_address="10.0.226.7",routerboard_name="giraite_pusyno_spinta"} 0.0
# HELP mktxp_system_fan_three_speed System fan 3 current speed
# TYPE mktxp_system_fan_three_speed gauge
mktxp_system_fan_three_speed{routerboard_address="10.0.226.7",routerboard_name="giraite_pusyno_spinta"} 4110.0
# HELP mktxp_system_psu1_state System PSU1 state
# TYPE mktxp_system_psu1_state gauge
mktxp_system_psu1_state{routerboard_address="10.0.226.7",routerboard_name="giraite_pusyno_spinta"} 0.0
# HELP mktxp_system_psu2_state System PSU2 state
# TYPE mktxp_system_psu2_state gauge
mktxp_system_psu2_state{routerboard_address="10.0.226.7",routerboard_name="giraite_pusyno_spinta"} 0.0
# HELP mktxp_system_psu1_state System PSU1 state
# TYPE mktxp_system_psu1_state gauge
mktxp_system_psu1_state{routerboard_address="10.0.226.7",routerboard_name="giraite_pusyno_spinta"} 1.0
# HELP mktxp_system_psu2_state System PSU2 state
# TYPE mktxp_system_psu2_state gauge
mktxp_system_psu2_state{routerboard_address="10.0.226.7",routerboard_name="giraite_pusyno_spinta"} 0.0
# HELP mktxp_system_psu1_state System PSU1 state
# TYPE mktxp_system_psu1_state gauge
mktxp_system_psu1_state{routerboard_address="10.0.226.7",routerboard_name="giraite_pusyno_spinta"} 0.0
# HELP mktxp_system_psu2_state System PSU2 state
# TYPE mktxp_system_psu2_state gauge
There is not much difference in the code in how the various health metrics are retrieved, so just wondering: is it happening on multiple devices with psu1-state available, or specifically on a given device / environment / configuration?
It is happening on all devices, including devices which don't have psu_state.
my config
[MKTXP]
listen = '0.0.0.0:49090' # Space separated list of socket addresses to listen to, both IPV4 and IPV6
socket_timeout = 5
initial_delay_on_failure = 120
max_delay_on_failure = 900
delay_inc_div = 5
bandwidth = False # Turns metrics bandwidth metrics collection on / off
bandwidth_test_interval = 600 # Interval for collecting bandwidth metrics
minimal_collect_interval = 5 # Minimal metric collection interval
verbose_mode = True # Set it on for troubleshooting
fetch_routers_in_parallel = True # Fetch metrics from multiple routers in parallel / sequentially
max_worker_threads = 5 # Max number of worker threads that can fetch routers (parallel fetch only)
max_scrape_duration = 30 # Max duration of individual routers' metrics collection (parallel fetch only)
total_max_scrape_duration = 90 # Max overall duration of all metrics collection (parallel fetch only)
compact_default_conf_values = False # Compact mktxp.conf, so only specific values are kept on the individual routers' level
[akademija_gw]
hostname = 213.226.176.218
[domeikava_gw]
hostname = 213.226.176.222
[ezerelis_gw]
hostname = xxx.xxx.xxx.xxx
[linksmakalnis_gw]
hostname = xxx.xxx.xxx.xxx
[raudonvaris_gw]
hostname = xxx.xxx.xxx.xxx
[ziezmariai_hq]
hostname = xxx.xxx.xxx.xxx
[giraite_pusyno_spinta]
hostname = xxx.xxx.xxx.xxx
[kaunas_lubinu_spinta]
hostname = xxx.xxx.xxx.xxx
[uzliedziai_pieniu_1_spinta]
hostname = xxx.xxx.xxx.xxx
[uzliedziai_pieniu_36_spinta]
hostname = xxx.xxx.xxx.xxx
[default]
ipsec = False
wireless_clients = False
monitor = True
use_comments_over_names = True
connection_stats = False
check_for_updates = False
wireless = False
capsman = False
ssl_certificate_verify = False
neighbor = False
user = True
plaintext_login = True
dhcp = False
interface = True
kid_control_dynamic = False
capsman_clients = False
netwatch = False
poe = False
use_ssl = False
ipv6_neighbor = False
connections = False
no_ssl_certificate = False
pool = False
ipv6_firewall = False
enabled = True
public_ip = False
kid_control_assigned = False
lte = False
firewall = False
ipv6_pool = False
installed_packages = True
dhcp_lease = False
queue = False
route = False
bgp = False
ipv6_route = False
switch_port = False
hostname = localhost
username = xxx.xxx.xxx.xxx
password = xxx.xxx.xxx.xxx
remote_dhcp_entry = None
remote_capsman_entry = None
port = 8728
yes I did another change that should help, can you try out with the latest now?
OK, now it is fixed :) Only devices with psu_state show stats, and there are no more duplicates :)
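For anyone reading later, the behaviour after the fix can be illustrated roughly like this. It is not the actual change made in mktxp, just a sketch under assumptions: the function `health_gauges`, the underscore-style key names, and the record shape are all hypothetical.

```python
def health_gauges(record: dict[str, str]) -> dict[str, float]:
    """Build PSU gauge values only for keys present in one health record.

    Assumption: `record` maps health item names to their string values,
    e.g. {"psu1_state": "ok"}; devices without a PSU sensor simply lack the key.
    """
    gauges: dict[str, float] = {}
    for key in ("psu1_state", "psu2_state"):
        if key in record:
            # One entry per metric name, so no series is emitted twice.
            gauges["mktxp_system_" + key] = 1.0 if record[key] == "ok" else 0.0
    return gauges
```

A device that reports no psu1_state or psu2_state yields an empty result, and a device that reports them yields exactly one value per metric.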