riemann/riemann-tools

Disabled checks are hard to discover

smortex opened this issue · 0 comments

When using a Riemann tool that features multiple checks, it is not easy to discover checks that are not enabled by default.

romain@desktop-fln40kq ~/Projects/riemann-tools % bundle exec bin/riemann-health --help         
Options:
  -h, --host=<s>                       Riemann host (default: 127.0.0.1)
  -p, --port=<i>                       Riemann port (default: 5555)
  -e, --event-host=<s>                 Event hostname
  -i, --interval=<i>                   Seconds between updates (default: 5)
  -t, --tag=<s>                        Tag to add to events
  -l, --ttl=<i>                        TTL for events
  -a, --attribute=<s>                  Attribute to add to the event
  -m, --timeout=<i>                    Timeout (in seconds) when waiting for acknowledgements (default: 30)
  -c, --tcp, --no-tcp                  Use TCP transport instead of UDP (improves reliability, slight overhead. (Default: true)
  -s, --tls                            Use TLS for securing traffic
  -k, --tls-key=<s>                    TLS Key to use when using TLS
  -r, --tls-cert=<s>                   TLS Certificate to use when using TLS
  --tls-ca-cert=<s>                    Trusted CA Certificate when using TLS
  -v, --tls-verify, --no-tls-verify    Verify TLS peer when using TLS (default: true)
  -u, --cpu-warning=<f>                CPU warning threshold (fraction of total jiffies) (default: 0.9)
  --cpu-critical=<f>                   CPU critical threshold (fraction of total jiffies) (default: 0.95)
  -d, --disk-warning=<f>               Disk warning threshold (fraction of space used) (default: 0.9)
  --disk-critical=<f>                  Disk critical threshold (fraction of space used) (default: 0.95)
  -g, --disk-ignorefs=<s+>             A list of filesystem types to ignore (default: anon_inodefs, autofs, cd9660, devfs, devtmpfs, fdescfs, iso9660, linprocfs, linsysfs, nfs, overlay, procfs, tmpfs)
  -o, --load-warning=<f>               Load warning threshold (load average / core) (default: 3.0)
  --load-critical=<f>                  Load critical threshold (load average / core) (default: 8.0)
  -y, --memory-warning=<f>             Memory warning threshold (fraction of RAM) (default: 0.85)
  --memory-critical=<f>                Memory critical threshold (fraction of RAM) (default: 0.95)
  -w, --uptime-warning=<i>             Uptime warning threshold (default: 86400)
  --uptime-critical=<i>                Uptime critical threshold (default: 3600)
  -n, --users-warning=<i>              Users warning threshold (default: 1)
  --users-critical=<i>                 Users critical threshold (default: 1)
  --swap-warning=<f>                   Swap warning threshold (default: 0.4)
  --swap-critical=<f>                  Swap critical threshold (default: 0.5)
  --checks=<s+>                        A list of checks to run. (Default: cpu, load, memory, disk, swap)
  --help                               Show this message

With the above output, one must run riemann-health --checks cpu,load,memory,disk,uptime,users,swap to run with all checks, because uptime (#218) and users (#226) are not enabled by default.

Should we enable all checks by default, and expect users who do not care about some of them to explicitly list the ones they are interested in?
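
Whichever defaults we pick, another way to make disabled checks discoverable might be to list every available check in the help text. Below is a minimal sketch of that idea. It uses Ruby's standard `OptionParser` for self-containment, not the option parser riemann-tools actually uses, and `ALL_CHECKS`, `DEFAULT_CHECKS` and `parse_checks` are hypothetical names introduced for illustration. The check names are the ones mentioned in this issue.

```ruby
require 'optparse'

# Assumption: the complete set of checks, as named in this issue.
ALL_CHECKS = %w[cpu load memory disk uptime users swap].freeze
# Checks currently enabled by default, per the --help output above.
DEFAULT_CHECKS = %w[cpu load memory disk swap].freeze

# Hypothetical helper: parses argv and returns [selected checks, help text].
def parse_checks(argv)
  selected = DEFAULT_CHECKS.dup

  parser = OptionParser.new do |opts|
    description = "Checks to run (available: #{ALL_CHECKS.join(', ')}; " \
                  "default: #{DEFAULT_CHECKS.join(', ')})"

    # The Array type makes OptionParser split a comma-separated value.
    opts.on('--checks LIST', Array, description) do |list|
      unknown = list - ALL_CHECKS
      # Reject names that are not known checks instead of silently ignoring them.
      raise OptionParser::InvalidArgument, unknown.join(', ') unless unknown.empty?

      selected = list
    end
  end

  parser.parse(argv) # non-destructive: argv is left untouched
  [selected, parser.help]
end

checks, help = parse_checks(%w[--checks cpu,uptime,users])
puts checks.inspect
puts help
```

With this approach the help message names `uptime` and `users` even though they are not part of the default list, so the question of which checks to enable by default can be decided separately.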