/sensu-plugins-disk-checks

This plugin provides native disk instrumentation for monitoring and metrics collection, including: health, usage, and various metrics.

Primary LanguageRubyMIT LicenseMIT

Sensu-Plugins-disk-checks

Build Status Gem Version Code Climate Test Coverage Dependency Status Codeship Status for sensu-plugins/sensu-plugins-disk-checks

Functionality

check-disk-fail

Check the output of dmesg for a given set of strings that may correspond to a failure

check-disk

Check disk capacity and inodes based upon the output of df.

check-disk-usage

Check disk capacity and inodes based upon the gem sys-filesystem.

Can adjust thresholds for larger filesystems by providing a 'magic factor' (-m). The default, 1.0, will not adapt threshold percentages for volumes.

The -l option can be used in combination with the 'magic factor' to specify the minimum size volume to adjust the thresholds for.

Refer to check_mk's documentation on adaptive thresholds.

You can also visualize the adjustment using WolframAlpha with the following:

y = 100 - (100-P)*(N^(1-m))/(x^(1-m)), y = P for x in 0 to 1024

Where P = base percentage, N = normalize factor, and m = magic factor

check-fs-writeable

Check to make sure a filesytem is writable. This will check both proc and do a smoke test of each given mountpoint. It can also auto-discover mount points in the self namespace.

check-fstab-mounts

Check the mount points in /etc/fstab to ensure they are all accounted for.

disk-capacity-metrics

Acquire disk capacity metrics from df and convert them to a form usable by graphite

disk-metrics

Read /proc/iostats for disk metrics and put them in a form usable by Graphite. See iostats.txt for more details.

disk-usage-metrics

Based on disk-capacity-metrics.rb by bhenerey and nstielau. The difference here being how the key is defined in graphite and the size we emit to graphite(now using megabytes), inode info has also been dropped.

check-smart-status

Check the SMART status of hardrives and alert based upon a given set of thresholds

check-smart

Check the health of a disk using smartctl

Files

  • bin/check-disk-fail.rb
  • bin/check-disk.rb
  • bin/check-disk-usage.rb
  • bin/check-fs-writable.rb
  • bin/check-fstab-mounts.rb
  • bin/check-smart-status.rb
  • bin/check-smart.rb
  • bin/metrics-disk.rb
  • bin/metrics-disk-capacity.rb
  • bin/metrics-disk-usage.rb

Usage

This is a sample input file used by check-smart-status, see the script for further details.

{
  "smart": {
    "attributes": [
      { "id": 1, "name": "Raw_read_Error_Rate", "read": "left16bit" },
      { "id": 5, "name": "Reallocated_Sector_Ct" },
      { "id": 9, "name": "Power_On_Hours", "read": "right16bit", "warn_max": 10000, "crit_max": 15000 },
      { "id": 10 , "name": "Spin_Retry_Count" },
      { "id": 184, "name": "End-to-End_Error" },
      { "id": 187, "name": "Reported_Uncorrect" },
      { "id": 188, "name": "Command_Timeout" },
      { "id": 193, "name": "Load_Cycle_Count", "warn_max": 300000, "crit_max": 600000 },
      { "id": 194, "name": "Temperature_Celsius", "read": "right16bit", "crit_min": 20, "warn_min": 10, "warn_max": 40, "crit_max": 50 },
      { "id": 196, "name": "Reallocated_Event_Count" },
      { "id": 197, "name": "Current_Pending_Sector" },
      { "id": 198, "name": "Offline_Uncorrectable" },
      { "id": 199, "name": "UDMA_CRC_Error_Count" },
      { "id": 201, "name": "Unc_Soft_read_Err_Rate", "read": "left16bit" },
      { "id": 230, "name": "Life_Curve_Status", "crit_min": 100, "warn_min": 100, "warn_max": 100, "crit_max": 100 }
    ]
  }
}

Installation

Installation and Setup

Notes