Automatron (Ah-Tom-a-tron) is an open source framework designed to detect and remediate IT systems issues. In other words, it monitors systems and, when it detects an issue, corrects it.
- Automatically detect and add new systems to monitor
- Monitoring is executed over SSH and is completely agentless
- Policy-based Runbooks define monitoring policies rather than server-specific configurations
- Supports Nagios-compliant health check scripts
- Allows arbitrary shell commands for both checks and actions
- Runbook flexibility with Jinja2 templating support
- Pluggable architecture that simplifies customization
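
The Nagios plugin convention mentioned in the list above signals state through the exit code: 0 for OK, 1 for WARNING, 2 for CRITICAL, and 3 for UNKNOWN. The following is a minimal sketch of a check written in that style. It is not one of Automatron's bundled plugins; the function names `percent_free`, `classify`, and `run_check` and the default thresholds are hypothetical.

```python
import shutil

# Exit codes defined by the Nagios plugin convention.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def percent_free(path):
    """Return the percentage of free space on the filesystem holding path."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total * 100


def classify(free_pct, warn=20.0, critical=10.0):
    """Map a free-space percentage to a state (thresholds are hypothetical)."""
    if free_pct <= critical:
        return CRITICAL
    if free_pct <= warn:
        return WARNING
    return OK


def run_check(path, warn=20.0, critical=10.0):
    """Return (exit_code, message); any lookup error is reported as UNKNOWN."""
    try:
        free_pct = percent_free(path)
    except (OSError, ZeroDivisionError) as exc:
        return UNKNOWN, f"UNKNOWN: {exc}"
    state = classify(free_pct, warn, critical)
    return state, f"{LABELS[state]}: {free_pct:.1f}% free on {path}"


code, message = run_check("/")
print(message)
# A real plugin would finish with sys.exit(code) so the caller sees the state.
```

Keeping the threshold logic in `classify` separate from the filesystem lookup makes the state mapping easy to test without touching a real disk.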
Automatron's actions are driven by policies called Runbooks. A Runbook defines which health checks are executed on a target host and which actions to take when those checks fail.
The example below is a Runbook that executes a monitoring plugin to determine the amount of free space on /var/log and, based on the result, executes a corrective action.

    name: Verify /var/log
    schedule: "*/2 * * * *"
    nodes:
      - "*"
    checks:
      mem_free:
        # Check the % of disk free: warning at 20% free, critical at 10% free
        execute_from: ontarget
        type: plugin
        plugin: systems/disk_free.py
        args: --warn=20 --critical=10 --filesystem=/var/log
    actions:
      logrotate_nicely:
        execute_from: ontarget
        trigger: 0
        frequency: 300
        call_on:
          - WARNING
        type: cmd
        cmd: bash /etc/cron.daily/logrotate
      logrotate_forced:
        execute_from: ontarget
        trigger: 5
        frequency: 300
        call_on:
          - CRITICAL
        type: cmd
        cmd: bash /etc/cron.daily/logrotate --force

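
To make the `trigger`, `frequency`, and `call_on` fields in the Runbook above more concrete, here is a small sketch of how such gating could work. It rests on assumptions the text above does not state: that `trigger` is the number of times a check must return a `call_on` state before the action may run, and that `frequency` is the minimum number of seconds between runs. The `Action` class and `should_run` function are hypothetical and are not Automatron's implementation.

```python
from dataclasses import dataclass


@dataclass
class Action:
    name: str
    trigger: int      # assumed: failures required before the action may run
    frequency: int    # assumed: minimum seconds between two runs
    call_on: tuple    # check states that make the action eligible
    last_run: float = float("-inf")  # never run yet


def should_run(action, status, failure_count, now):
    """Decide whether an action fires for the given check result."""
    if status not in action.call_on:
        return False
    if failure_count < action.trigger:
        return False
    return now - action.last_run >= action.frequency


nicely = Action("logrotate_nicely", trigger=0, frequency=300, call_on=("WARNING",))
forced = Action("logrotate_forced", trigger=5, frequency=300, call_on=("CRITICAL",))

print(should_run(nicely, "WARNING", 1, now=1000.0))   # True
print(should_run(forced, "CRITICAL", 3, now=1000.0))  # False: below trigger
print(should_run(forced, "CRITICAL", 5, now=1000.0))  # True
forced.last_run = 1000.0
print(should_run(forced, "CRITICAL", 6, now=1100.0))  # False: ran 100 s ago
```

Under these assumptions, the gentle log rotation runs on the first WARNING, while the forced rotation waits until the check has been CRITICAL several times.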
Jinja2 support was added to Runbooks to allow for extensive customization. The example below uses Jinja2 to determine which cmd to execute based on Automatron's facts system. It detects whether nginx is running and, if not, restarts it.

    name: Verify nginx is running
    schedule: "*/5 * * * *"
    nodes:
      - "*web*"
    checks:
      nginx_is_running:
        # Check if nginx is running
        execute_from: ontarget
        type: cmd
        {% if "Linux" in facts['os'] %}
        cmd: service nginx status
        {% else %}
        cmd: /usr/local/etc/rc.d/nginx status
        {% endif %}
    actions:
      restart_nginx:
        execute_from: ontarget
        trigger: 2
        frequency: 300
        call_on:
          - WARNING
          - CRITICAL
        type: cmd
        {% if "Linux" in facts['os'] %}
        cmd: service nginx restart
        {% else %}
        cmd: /usr/local/etc/rc.d/nginx restart
        {% endif %}

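
To see which branch of the conditional above survives templating, the sketch below renders a fragment of the check with the `jinja2` Python package for two sets of facts. This illustrates plain Jinja2 behaviour, not Automatron's own rendering code; the `render_cmd` helper is hypothetical, and the fact value `"FreeBSD"` is an assumed example of a non-Linux host.

```python
from jinja2 import Template

# Fragment of the nginx check, kept as a plain template string.
RUNBOOK_FRAGMENT = """\
type: cmd
{% if "Linux" in facts['os'] %}
cmd: service nginx status
{% else %}
cmd: /usr/local/etc/rc.d/nginx status
{% endif %}
"""


def render_cmd(facts):
    """Render the fragment and return the one cmd line that remains."""
    rendered = Template(RUNBOOK_FRAGMENT).render(facts=facts)
    lines = [line.strip() for line in rendered.splitlines()]
    return next(line for line in lines if line.startswith("cmd:"))


linux_cmd = render_cmd({"os": "Linux"})
bsd_cmd = render_cmd({"os": "FreeBSD"})  # assumed fact value, for illustration

print(linux_cmd)  # cmd: service nginx status
print(bsd_cmd)    # cmd: /usr/local/etc/rc.d/nginx status
```

Because the template is rendered before the YAML is read, each host ends up with exactly one `cmd` key, so the two branches never conflict.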
- Follow our quick start guide: Automatron in 10 minutes
- Check out example Runbooks for automating common tasks
- Read our Runbook Reference documentation to better understand the anatomy of a Runbook
- Follow @Automatronio on Twitter to keep up to date
- Join #Automatron on Gitter for help or just to hang out