/cx_distat

Simple (and stupid) utility to collect statistics for small-sized clusters.

DistStat

A simple utility for collecting CPU/MEM/DISK/NET statistics from a small cluster.

distat is designed as a lightweight, standalone tool that collects statistics from a cluster. The statistics currently collected are:

  • CPU usage
  • Memory usage
  • Disk read rate (in KB)
  • Disk write rate (in KB)
  • Network read rate (in KB)
  • Network write rate (in KB)

Dependencies

  • ssh: distat currently uses ssh to execute commands on remote hosts. Make sure that every remote host can be logged in to via ssh without a password.
  • sar: used to collect the statistics. All remote hosts must have sar installed.
  • bash: distat is written in bash.
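
The dependencies above can be checked before the first run. The following is a minimal sketch and not part of distat: the helper names `check_host` and `check_hosts` are hypothetical, and it assumes key-based ssh login has already been set up (for example with `ssh-keygen` and `ssh-copy-id`).

```bash
#!/usr/bin/env bash
# Pre-flight check (hypothetical helpers, not shipped with distat).

# check_host HOST
# Succeeds only if HOST accepts a password-less ssh login and has sar on its PATH.
# BatchMode=yes makes ssh fail instead of prompting for a password.
check_host() {
  local host=$1
  ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" 'command -v sar' >/dev/null 2>&1
}

# check_hosts HOST...
# Prints one "ok" or "FAILED" line per host; returns non-zero if any host failed.
check_hosts() {
  local host rc=0
  for host in "$@"; do
    if check_host "$host"; then
      echo "$host ok"
    else
      echo "$host FAILED"
      rc=1
    fi
  done
  return "$rc"
}
```

For example, `check_hosts node1 node2` (placeholder host names) reports which hosts are ready.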

Usage

  1. Edit the configuration in distat-env.sh (optional; the default settings may already suit you).
  2. Run distat.

Output

The result files are written to RESULT_DIR. A result file is produced when any of the following occurs:

  • The size of the temporary result file (stored in TMP_DIR) reaches the MAX_FILESIZE limit.
  • The temporary result file has been open longer than MAX_FILETIME (max file open time).
  • A new day begins.
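
The three conditions can be expressed as a single check. This is a sketch only, not distat's actual code: the function `should_rotate` and its arguments are hypothetical, it assumes MAX_FILESIZE is a size in bytes and MAX_FILETIME a duration in seconds (as in the configuration table), and it needs bash 4.2 or later for `printf '%(...)T'`.

```bash
#!/usr/bin/env bash
# Sketch of the rotation decision (hypothetical helper, not distat's code).

MAX_FILESIZE=${MAX_FILESIZE:-1048576}      # bytes
MAX_FILETIME=${MAX_FILETIME:-3153600000}   # seconds

# should_rotate SIZE_BYTES OPENED_EPOCH NOW_EPOCH
# Returns 0 (true) when the temporary file should become a result file.
should_rotate() {
  local size=$1 opened=$2 now=$3 day_opened day_now

  # 1. The size limit has been reached.
  if (( size >= MAX_FILESIZE )); then return 0; fi

  # 2. The file has been open longer than allowed.
  if (( now - opened > MAX_FILETIME )); then return 0; fi

  # 3. The calendar day (local time) changed since the file was opened.
  printf -v day_opened '%(%Y-%m-%d)T' "$opened"
  printf -v day_now    '%(%Y-%m-%d)T' "$now"
  if [[ $day_opened != "$day_now" ]]; then return 0; fi

  return 1
}
```

A collector loop could call `should_rotate "$size" "$opened" "$(date +%s)"` after each interval and move the temporary file to RESULT_DIR when it returns true.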

The result file is a delimited text file; the field delimiter can be customized with DELIMITER. A result file contains one or more lines of statistics. Each line has the following format (taking DELIMITER=, as an example):
TIMESTAMP,HOSTNAME,CPU,MEM,DISK_READ,DISK_WRITE,NET_READ,NET_WRITE
One line is written for each host at every collection interval.
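
Because each line is a flat delimited record, result files are easy to post-process with standard tools. The sketch below averages the CPU field per host; the function name `avg_cpu_per_host` is hypothetical, and it assumes DELIMITER=, and the field order shown above.

```bash
#!/usr/bin/env bash
# Summarize distat result files: average CPU value per host.
# Assumes DELIMITER=, and eight fields per line.

# avg_cpu_per_host FILE...
# Prints "HOSTNAME average" lines, sorted by host name.
avg_cpu_per_host() {
  LC_ALL=C awk -F, '
    NF == 8 { sum[$2] += $3; n[$2]++ }   # field 2 = HOSTNAME, field 3 = CPU
    END     { for (h in sum) printf "%s %.2f\n", h, sum[h] / n[h] }
  ' "$@" | LC_ALL=C sort
}
```

Lines that do not have exactly eight fields are skipped, so a partially written last line does not distort the result.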

Configuration

| Item | Description | Default Value |
|---|---|---|
| MAX_FILESIZE | Max size of a result file (in bytes). | 1048576 (1M) |
| MAX_FILETIME | Max open time of a result file (in seconds). | 3153600000 (about 100 years) |
| MAX_SEQ | Max sequence number of a result file (used as suffix). | 9999 |
| TMP_DIR | Temporary directory for intermediate files. | tmp |
| PID_FILE | File containing the PID of the running script (for killing it manually). | tmp/distat.{instanceid}.pid |
| LOG_DIR | Log directory. | logs |
| RESULT_DIR | Result file directory. | results |
| SLAVES | List of hosts to collect statistics from. | localhost |
| NET_IFACE | Network interface device name pattern (regex). | bond0 |
| DISK_DEV | Disk device name pattern (regex). | sd (for SATA disks) |
| INTERVAL | Collection interval (in seconds). | 60 (1 minute) |
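
As an illustration, a customized distat-env.sh might look like the fragment below. This is a sketch under assumptions: it presumes the file is sourced by bash and uses plain variable assignments, the host and interface names are placeholders, and the space-separated format for SLAVES is a guess rather than something confirmed here.

```bash
# distat-env.sh (example overrides; all values are illustrative)

INTERVAL=30                          # collect every 30 seconds instead of 60
DELIMITER=','                        # field delimiter in result files
SLAVES="node1 node2 node3"           # assumed format: space-separated host names
NET_IFACE='eth0'                     # regex matched against interface names
DISK_DEV='sd'                        # regex matched against disk device names
RESULT_DIR=results
MAX_FILESIZE=$((10 * 1024 * 1024))   # rotate result files at 10 MiB
```

Items that are not set here keep their default values from the table.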