librenms/librenms-agent

zfs extension stopped logging on FreeBSD 14 - also encounters uninitialized values

Closed this issue · 8 comments

Logging of zfs stats has stopped working after upgrading a host from FreeBSD 13.2 to FreeBSD 14.0 on Sunday.

On FreeBSD 14, with code pulled down from today, I get.

[21:25 r730-03 dvl /usr/local/etc/snmp] % /usr/local/etc/snmp/zfs-freebsd
Use of uninitialized value in subtraction (-) at /usr/local/etc/snmp/zfs-freebsd line 266.
Use of uninitialized value in division (/) at /usr/local/etc/snmp/zfs-freebsd line 267.
{"version":3,"error":0,"errorString":"","data":{"faulted":0,"mru_ghost_hits":279522,"l2_rebuild_bufs":"0","pre_data_total":17156808,"evict_skip":"185","arc_size_per":54.2035791225306,"meta_hits_per":83.4190690139206,"anon_hits_per":-0.0200643622613628,"evict_l2_eligible_mfu":"368036879872","l2_asize":"0","mfu_per":90.8189974473971,"pre_meta_hits":22721021,"data_pre_per":8.99577007564577,"l2_misses":0,"l2_read_bytes":"0","l2_writes_lock_retry":"0","evict_l2_cached":"0","mru_ghost_per":0.0135464104479553,"l2_rebuild_dh_errors":"0","l2_evict_reading":"0","pre_meta_misses_per":6.04980783869206,"mfu_hits":1873995174,"l2_free_on_write":"0","demand_data_misses":2283349,"l2_access_total":0,"online":2,"l2_hdr_size":"0","arc_misses":19337920,"evict_l2_eligible":"3732548153856","anon_hits":-414016,"l2_hits":0,"l2_log_blk_avg_asize":"0","l2_log_blk_count":"0","actual_hit_per":99.0715321417107,"demand_meta_hits":1721302119,"unknown":0,"deleted":"25233563","l2_rebuild_unsupported":"0","freq_used_per":100,"pools":[{"asyncq_write_a":"0","alloc":"18236652015616","bandwidth_r":"208","frag":"13","total_wait_r":"29249561","asyncq_read_a":"0","size":"23983097380864","dedup":"1.00","scrubq_read_a":"0","syncq_write_p":"0","cap":"76","altroot":"-","asyncq_write_p":"0","operations_w":"265","free":"5746445365248","bandwidth_w":"153663026","trim_wait":"25423399","syncq_wait_r":"15666359","ckpoint":"0","trimq_write_p":"0","syncq_wait_w":"952343","asyncq_read_p":"0","disk_wait_w":"9648927","operations_r":"5746445365248","name":"data01","scrubq_read_p":"0","asyncq_wait_w":"552080176","syncq_write_a":"0","syncq_read_a":"0","health":0,"syncq_read_p":"53391392","expandsz":0,"total_wait_w":"53527611","scrub_wait":"6215895","disk_wait_r":"255929172","trimq_write_a":"0"},{"expandsz":0,"syncq_read_p":"937798","health":0,"syncq_read_a":"0","trimq_write_a":"0","disk_wait_r":"452036","scrub_wait":"141109","total_wait_w":"202345","disk_wait_w":"178849","asyncq_read_p":"0","syncq_wait_w":"1381","syncq_write_a":"0","asyncq_wait_w":"406","scrubq_read_p":"0","name":"zroot","operations_r":"434860814336","free":"434860814336","operations_w":"0","asyncq_write_p":"0","trimq_write_p":"0","syncq_wait_r":"140837","ckpoint":"0","trim_wait":"329403","bandwidth_w":"8823","asyncq_read_a":"0","total_wait_r":"311418","frag":"8","alloc":"7520817152","bandwidth_r":"20","asyncq_write_a":"0","altroot":"-","syncq_write_p":"0","cap":"1","scrubq_read_a":"0","dedup":"1.00","size":"442381631488"}],"offline":0,"l2_rebuild_io_errors":"0","p":null,"l2_rebuild_success":"0","l2_rw_clash":"0","evict_l2_eligible_mru":"3364511273984","target_size_max":67505516544,"l2_log_blk_asize":"0","unavail":0,"health":1,"pre_meta_misses":1169907,"cache_miss_per":0.928467858289338,"l2_rebuild_cksum_lb_errors":"0","mfu_size":36742624663,"recycle_miss":0,"evict_l2_ineligible":"76875616256","removed":0,"pre_misses_per":80.7399192881137,"l2_rebuild_asize":"0","l2_mru_asize":"0","target_size_arat":0.544290697176596,"l2_writes_done":"0","pre_data_hits":1543387,"l2_feeds":"0","l2_rebuild_lowmem":"0","meta_misses_per":1.40264826827291,"l2_cksum_bad":0,"l2_rebuild_size":"0","demand_hits_per":15.4050105574739,"l2_mfu_asize":"0","target_size":36742624663,"degraded":0,"l2_writes_error":0,"arc_hits":2063439618,"min_size_per":3.17470620731141,"l2_errors":0,"demand_meta_misses":271243,"l2_bufc_metadata_asize":"0","target_size_min":2143101824,"pre_meta_hits_per":1.10112361911624,"real_hits":2063439618,"l2_data_to_meta_ratio":"0","l2_bufc_data_asize":"0","rec_used_per":0,"evict_l2_skip":"0","l2_prefetch_asize":"0","arc_accesses_total":2082777538,"mfu_ghost_per":0.00651795181340751,"pre_data_misses":15613421,"l2_size":"0","l2_evict_lock_retry":"0","l2_io_error":0,"l2_rebuild_bufs_precached":"0","demand_data_total":320156440,"data_demand_per":99.2868021021223,"mru_hits":189444444,"cache_hits_per":99.0715321417107,"mutex_skip":"34138","demand_data_hits":317873091,"arc_size":36590406072,"l2_log_blk_writes":"0","mfu_ghost_hits":134494,"l2_evict_l1cached":"0","pre_hits_per":0.0747968094891934,"l2_abort_lowmem":"0","l2_writes_sent":"0","target_size_per":54.4290697176596,"l2_rebuild_log_blks":"0","mru_per":9.18100255260292,"demand_misses_per":11.8076246049213,"l2_write_bytes":"0"}}

If it helps:

[21:26 r730-03 dvl /usr/local/etc/snmp] % zpool list
NAME     SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
data01  21.8T  16.6T  5.23T        -         -    13%    76%  1.00x    ONLINE  -
zroot    412G  7.01G   405G        -         -     8%     1%  1.00x    ONLINE  -

Hi,

I bumped into this yesterday.

I found sysctl kstat.zfs.misc.arcstats.p is not present in FreeBSD 14.

I have been using zfs-freebsd.py and I'm trying this naive diff:

diff -r 64c295527140 snmp/zfs-freebsd.py
--- a/snmp/zfs-freebsd.py       Tue Dec 19 11:12:18 2023 +1030
+++ b/snmp/zfs-freebsd.py       Wed Dec 20 08:37:10 2023 +1030
@@ -7,7 +7,7 @@
 import subprocess
 
 SYSCTL = "/sbin/sysctl"
-ZPOOL = "/usr/local/sbin/zpool"
+ZPOOL = "/sbin/zpool"
 
 
 def percent(numerator, denominator, default=0):
@@ -72,13 +72,24 @@
     output["target_size"] = stats["kstat.zfs.misc.arcstats.c"]
 
     # ARC size breakdown
-    output["mfu_size"] = (
-        stats["kstat.zfs.misc.arcstats.size"] - stats["kstat.zfs.misc.arcstats.p"]
-    )
-    output["p"] = stats["kstat.zfs.misc.arcstats.p"]
-    output["rec_used_per"] = (
-        stats["kstat.zfs.misc.arcstats.p"] / stats["kstat.zfs.misc.arcstats.size"] * 100
-    )
+    if "kstat.zfs.misc.arcstats.p" in stats:
+        output["mfu_size"] = (
+            stats["kstat.zfs.misc.arcstats.size"] - stats["kstat.zfs.misc.arcstats.p"]
+        )
+        output["p"] = stats["kstat.zfs.misc.arcstats.p"]
+        output["rec_used_per"] = (
+            stats["kstat.zfs.misc.arcstats.p"] / stats["kstat.zfs.misc.arcstats.size"] * 100
+        )
+    else:
+        # kstat.zfs.misc.arcstats.p doesn't seem to exist in FreeBSD 14.0
+        output["mfu_size"] = stats["kstat.zfs.misc.arcstats.mfu_size"]
+        output["p"] = (
+            stats["kstat.zfs.misc.arcstats.size"] - stats["kstat.zfs.misc.arcstats.mfu_size"]
+        )
+        output["rec_used_per"] = (
+            output["p"] / stats["kstat.zfs.misc.arcstats.size"] * 100
+        )
+
     output["freq_used_per"] = (
         output["mfu_size"] / stats["kstat.zfs.misc.arcstats.size"] * 100
     )

@dlangille @Bobzikwick

Looking at that now. Also you both are using really out of date extends as well. https://github.com/librenms/librenms-agent/blob/master/snmp/zfs is the one you should be using for either Linux or FreeBSD.

ohh! derp, just realized it said v. 3 there, okay that is the right one, you just have it named like the really old one

@Bobzikwick Thanks for figuring out a work around for when p is missing there. The extend has been updated now.

@dlangille Let me know if you have any issues with the new one.

@VVelox For graphing, it's nan across the board. Only one host has been updated with the new zfs script.

So I did this: at device/120/capture, click on Poller, then on Run, then on Download.

Looking in the file, I see:

Application: zfs, app_id=4978SNMP['/usr/local/bin/snmpget' '-v3' '-l' 'authPriv' '-n' "" '-a' 'SHA' '-A' 'PASSWORD' '-u' 'USER' '-x' 'AES' '-X' 'PASSWORD' '-Oqv' '-m' 'NET-SNMP-EXTEND-MIB' '-M' '/usr/local/www/librenms/mibs' 'udp:HOSTNAME:161' 'nsExtendOutputFull.3.122.102.115']
env: perl: No such file or directory  
  

zfs:-3:Invalid JSON
SQL[update `applications` set `timestamp` = NOW() where `app_id` = ? [4978] 0.87ms] 

I suspect that is because perl is on on the PATH.

My fix: #!/usr/local/bin/perl

This is a frequent patch FYI (e.g. https://cgit.freebsd.org/ports/tree/net-mgmt/librenms/files/patch-LibreNMS_wrapper.py)

Now my graphs are working.

@dlangille This can easily be fixed via prefixing the command with env like this...

extend logsize /usr/bin/env PATH=/sbin:/bin:/usr/sbin:/usr/bin:/usr/local/sbin:/usr/local/bin:/root/bin /usr/local/etc/snmp-extends/logsize -b

@VVelox That's a clever way and puts it all within snmpd.conf without affecting the original script.

@dlangille Yeah. Got in the habit of that as lots of OSes don't have meaningful value populated for $ENV{PATH} for when it comes to snmpd. Also great to set for cron as well. Although there there can easily just be set once instead of needing env.