Utility to diagnose systems and run recipe-like actions
- Node.js (>= 10)
- AWS EC2
- AWS RDS (MySQL/MariaDB)
- SSH
- Runs commands on servers and expects a possible output
- Reports on PDF
- Sends Slack messages if a channel is configured
Inside the configs
folder you will be able to add a config.yml
file with the following format:
config:
security:
certificates_path: $HOME/Documents/Certificates/
default_ssh_key: $HOME/.ssh/id_rsa
slack:
webhook: https://hooks.slack.com/services/123ABC456/123ABC456DEF/5648964FG78WER4H77RWH7RTY
awsCredentialsFile: $HOME/.aws/credentials
You can entirely skip the slack
section if you do not intend to send reports via this channel. More chat integrations will be coming soon. The credentials_path
is required as long as you need a certificate to enter a machine via SSH to perform tests (EC2).
Within the configs
folder, you must create a templates
folder, where you can place all of your recipes. These can be created in YML, just like the configuration file. More formats will be available in the future.
Commands and files can have patterns that can be used to introduce specific strings, like today's date. More details below.
template:
name: My Super EC2 machine cluster
type: ec2
region: ap-northeast-1
tags:
- name: My Awesome Tag
value: Some value you place
auth:
user: ubuntu
key: SuperSecretKey.pem
actions:
- name: root storage under 80%
type: command
command: df -h / | tail -n1 | tr -s " " | cut -f5 -d" " | tr -d "%"
expected_output:
type: max_value
limit: 80
- name: update os
type: command
command: sudo apt update && sudo apt upgrade -y
- name: cpu usage below 80%
type: cloudwatch
metric: CPUUtilization
expected_output:
type: below
value: 80
measure: percent
- name: cpu credits above 80 units
type: cloudwatch
metric: CPUCreditBalance
expected_output:
type: above
value: 80
measure: units
- name: mysql is running
type: command
command: sudo systemctl status mysql
expected_output:
type: pattern
value: running
- name: apache syntax
type: command
command: sudo apachectl configtest 2>&1
expected_output:
type: match
value: Syntax OK
- name: web directories owners
type: command
command: ls -l /var/www/ | grep -v apache
expected_output:
type: empty
- name: website code backup
type: s3-exists
path: s3://my-awesome-s3-backups-bucket/www/%TODAY%/website.tar.gz
- name: website database backup
type: s3-exists
path: s3://my-awesome-s3-backups-bucket/www/%TODAY%/database.sql
template:
name: My Super Awesome DB
type: rds
region: af-south-1
database_id: my-super-awesome-db
engine: mysql
profile: default
credentials:
host: my-super-awesome-db.ciymlueaqnrt.af-south-1.rds.amazonaws.com
user: username
pass: my_4w3s0m3_p4ssw0rd
db: database
actions:
- name: latest snapshot
type: snapshot
- name: can connect to db
type: connection
- name: cpu below 80%
type: cloudwatch
metric: CPUUtilization
expected_output:
type: below
value: 80
measure: percent
- name: more than 5 gb of storage available
type: cloudwatch
metric: FreeStorageSpace
expected_output:
type: above
value: 5368709120
measure: gb
- name: name of the test
- type: specified the type of test to be run, can be "command", "s3-exists" or "cloudwatch"
- command: runs an EC2 command
- expected_output: the kind of output that the system should expect from running this
- type: can be "empty", "pattern" or "match" or "max_value".
- empty: nothing is emitted
- pattern: the system will look for the string you enter in the output, if it exists the test passes
- match: the system will expect the same thing in the output and in this value, if it does not match, the test fails
- max_value: numeric values only, if the number exceeds the one given in the config file, the test fails
- value: the value that the system should expect, can be skipped if type is empty
- type: can be "empty", "pattern" or "match" or "max_value".
- expected_output: the kind of output that the system should expect from running this
- s3-exists: checks if the given file exists and is accessible within the instances' scope
- path: the path to check
- cloudwatch: checks a graph in cloudwatch since 7 days ago to the execution moment
- metric: the CloudWatch metric name to retrieve
- expected_output: the kind of output that the system should expect from running this
- type: can be "below" or "above"
- value: the number to which the system will trigger the error
- measure: it can be "percent", "gb" or "units" depending on what you would like to check
- command: runs an EC2 command