Faulty drive/degraded state doesn't trigger anything
Hyrla opened this issue · 4 comments
Hello,
I set up this wonderful plugin successfully. However, no alerts are triggered, even when I manually set a disk to the faulty state.
My Zabbix Agent is used in passive mode, so I just changed "Zabbix Agent (active)" to "Zabbix Agent" in the discovery template settings.
Here's the output of a few debug commands:
su -c 'zabbix_agentd -t rabe.raid.md.raid-device.discovery' -s /bin/bash zabbix
rabe.raid.md.raid-device.discovery [t|{"data":[{"{#MD_RAID_RAID_DEV_NAME}":"md0"}]}]
mdadm -D /dev/md0
/dev/md0:
Version : 1.2
Creation Time : Fri Mar 27 19:42:13 2020
Raid Level : raid5
Array Size : 3906762752 (3725.78 GiB 4000.53 GB)
Used Dev Size : 1953381376 (1862.89 GiB 2000.26 GB)
Raid Devices : 3
Total Devices : 3
Persistence : Superblock is persistent
Intent Bitmap : Internal
Update Time : Sun Mar 29 13:25:00 2020
State : clean, degraded
Active Devices : 2
Working Devices : 2
Failed Devices : 1
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 512K
Consistency Policy : unknown
Name : scarif:0 (local to host scarif)
UUID : a54c09d4:af8606cc:97bc4ab1:5d7c5f77
Events : 23977
Number Major Minor RaidDevice State
0 8 1 0 active sync /dev/sda1
- 0 0 1 removed
3 8 33 2 active sync /dev/satac1
1 8 17 - faulty /dev/satab1
Screenshot of my host's discovered items:
https://i.ibb.co/FVcQpMM/Capture-d-cran-de-2020-03-29-17-04-49.png
My host's triggers for MD RAID active
https://i.ibb.co/g9qs4rj/Capture-d-cran-de-2020-03-29-17-08-21.png
Am I doing anything wrong?
Thank you :)
Is the array showing up as degraded in /sys/block/md0/md/degraded?
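One way to check this is to read the kernel's md sysfs attribute directly. The sketch below is only an illustration: the helper name md_degraded_count and its optional second argument (an alternative sysfs root, useful for testing) are assumptions and not part of the plugin.

```shell
#!/bin/sh
# Print the number of devices by which an MD array is degraded,
# as reported by the kernel in <sysfs root>/<device>/md/degraded.
# md_degraded_count is a hypothetical helper name, not part of the plugin.
md_degraded_count() {
    dev="${1:-md0}"                 # array name, e.g. md0
    sysfs_root="${2:-/sys/block}"   # override only for testing (assumption)
    file="$sysfs_root/$dev/md/degraded"
    if [ ! -r "$file" ]; then
        echo "cannot read $file" >&2
        return 1
    fi
    cat "$file"
}
```

Running `md_degraded_count md0` on the affected host should print a non-zero value if the kernel itself considers the array degraded; 0 would point to a problem below the Zabbix layer.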
@paraenggu PTAL
@Hyrla sorry for the late reply, I overlooked this one.
A degraded drive should fire at least two triggers:
RAID array device MD {#MD_RAID_RAID_DEV_NAME} has {ITEM.VALUE1} degraded device(s) on {HOST.NAME}
RAID component device MD {#MD_RAID_RAID_DEV_NAME} RD {#MD_RAID_COMPONENT_DEV_NAME} is in {ITEM.VALUE1} state on {HOST.NAME}
In your case above:
RAID array device MD md0 has 1 degraded device(s) on {HOST.NAME}
RAID component device MD md0 RD satab1 is in faulty state on {HOST.NAME}
Could you please post the relevant latest values for the affected host?