
Linux system's health monitor


Overwatch

On any Linux server, the load average is a key indicator of system health.

root@localhost:~# uptime
 17:46:36 up 9 days,  8:19,  1 user,  load average: 0.00, 0.00, 0.00

The load average is the average number of processes that are runnable or in uninterruptible sleep, reported over the last 1, 5, and 15 minutes. A runnable process is either using the CPU or waiting to use it. On Linux, a process in uninterruptible sleep is usually waiting for I/O, typically disk access. Processes in ordinary interruptible sleep, for example waiting on a network socket or a database reply, are not counted. The metric therefore reflects demand for CPU time plus tasks blocked on I/O, not CPU usage alone.

Overwatch monitors this metric by parsing the output of uptime.
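As a rough illustration of that idea, the 1-minute load average can be pulled out of an uptime line with standard tools. This is a minimal sketch, not the actual overwatch script: the function names `load_1min` and `is_overloaded` and the threshold of 1.0 are assumptions made for the example.

```shell
#!/bin/sh
# Hypothetical helper (not taken from overwatch): print the 1-minute
# load average found in a single line of uptime output.
load_1min() {
    # Drop everything up to "load average:", then keep the first field.
    printf '%s\n' "$1" | sed -e 's/.*load average[s]*: *//' -e 's/,.*//'
}

# Hypothetical helper: succeed (exit 0) when load $1 exceeds limit $2.
is_overloaded() {
    awk -v load="$1" -v limit="$2" 'BEGIN { exit !(load > limit) }'
}

# Demo on the sample line shown above. For live data, pass
# "$(LC_ALL=C uptime)" so the decimal separator is always a dot.
sample=' 17:46:36 up 9 days,  8:19,  1 user,  load average: 0.00, 0.00, 0.00'
load=$(load_1min "$sample")

if is_overloaded "$load" 1.0; then
    echo "load=$load status=overloaded"
else
    echo "load=$load status=ok"
fi
```

A common rule of thumb is to compare the load against the number of CPU cores (for example the value printed by `nproc`) rather than a fixed limit, since a load of 4 means something different on 2 cores than on 16.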

Usage

chmod ug+x ~/bin/overwatch
watch -n1 '~/bin/overwatch >> ~/log/overwatch.log' 2>&1 &

Notification:

slack-report