Sheriff is a web-based tool for server monitoring and reporting.
- keeps track of what was reported (historic values)
- keeps track of who reported (hostname/ip)
- distributes scout-compatible plugins to deputies (see deputy)
- alerts via logging / email / sms when something goes wrong
git clone git@github.com:dawanda/sheriff.git
cd sheriff
bundle
cp config/config.yml.example config/config.yml
cp config/database.yml.example config/database.yml
rake db:create
rake db:migrate
rake #run tests
rails s
curl "http://localhost:3000/notify?group=Cron.count_users&value=123"
# open "http://localhost:3000/reports/1"
# add a value validation (value: 1, warn via: email)
curl "http://localhost:3000/notify?group=Cron.count_users&value=123"
# open "http://localhost:3000/reports"
# you should see an error => group (Cron) and subgroup (count_users) are marked as error
Values are pushed to Sheriff via HTTP GET, e.g. with curl, but preferably via deputy:
curl "http://localhost:3000/notify?group=Cron.count_users&value=123"
deputy Cron.count_users 123
# report the success/failure of script execution
./database_backup ; deputy Cron.db_backup $?
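If neither curl nor deputy is available, the same HTTP GET can be sent from Ruby. The following is a minimal sketch, not part of Sheriff or deputy: the helper names `notify_url` and `report` are made up for illustration, and only the `/notify` endpoint with its `group` and `value` parameters comes from the examples above.

```ruby
require "uri"
require "net/http"

# Hypothetical helper: build the /notify URL for a group ("Group.subgroup") and a value.
def notify_url(base_url, group, value)
  uri = URI.join(base_url, "/notify")
  uri.query = URI.encode_www_form(group: group, value: value)
  uri
end

# Hypothetical helper: send the report with a plain HTTP GET and return the status code.
def report(base_url, group, value)
  Net::HTTP.get_response(notify_url(base_url, group, value)).code
end

puts notify_url("http://localhost:3000", "Cron.count_users", 123)
```

Calling `report("http://localhost:3000", "Cron.count_users", 123)` against a running Sheriff should then have the same effect as the curl call.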
Sheriff validates reported values against a set of validations to see if someone should be notified.
- ValueValidation -- reported value matches 'x', 1, 1..5, /foo/
- RunEveryValidation -- reported every 10 minutes / only once per day
- RunBetweenValidation -- reported between 00:00 and 02:00
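To get a feel for what these validations check, here is a rough sketch in plain Ruby. It is not Sheriff's actual implementation: `value_matches?` and `reported_between?` are hypothetical helpers, and the sketch assumes the reported value has already been converted to the right type (values from a query string arrive as strings).

```ruby
# Hypothetical helper: Ruby's case equality covers all four expected-value
# styles from the list above: 'x', 1, 1..5 and /foo/.
def value_matches?(expected, reported)
  expected === reported
end

# Hypothetical helper: true if the report time falls inside a same-day window
# such as "00:00".."02:00". Windows that cross midnight are not handled.
def reported_between?(time, from, to)
  hhmm = time.strftime("%H:%M")
  from <= hhmm && hhmm <= to
end

puts value_matches?(1..5, 3)
puts value_matches?(/foo/, "bar")
puts reported_between?(Time.new(2024, 1, 1, 1, 30), "00:00", "02:00")
```

A failing check is what would lead to a notification via logging, email or sms.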
Plugins can be stored and assigned to deputies/servers to run every x minutes/hours/days. These plugins are compatible with Scout, so you can use the 50+ existing Scout plugins or build your own.
class Redis < Scout::Plugin
def build_report
report :memory => `/opt/redis/redis-cli info | grep used_memory: | sed s/used_memory://`.strip
end
end
Plugins are executed via deputy --run-plugins. deputy queries Sheriff for the plugins assigned to this host and runs those that are due. The Sheriff host is defined e.g. in:
#/etc/deputy.yml
sheriff_url: localhost:3000
To keep Sheriff responsive, report processing should be queued in Resque.
Install redis on localhost and set resque: true in config.yml
# config.yml
resque: true
If activated, Resque workers are started on cap deploy and Resque status can be seen at your-sheriff-url.com/resque/overview
Add hoptoad_api_key to config.yml to get errors reported to Hoptoad.
If you want performance analysis via Newrelic, add your config/newrelic.yml
You can play around with the demo at sheriff.heroku.com.
It's public, so people will make crazy/dangerous plugins.
Do not run plugins via deputy.
Only ValueValidations work, since there are no cron jobs.
# configure deputy via /etc/deputy.yml or ~/.deputy.yml
sheriff_url: http://sheriff.heroku.com
# report a value
deputy Foo.bar 111
# run plugins written by anonymous pranksters
deputy --run-plugins --no-wait
To run your own setup:
Set up your Heroku account
git clone https://github.com/dawanda/sheriff.git
cd sheriff
heroku create my-sheriff
Make a config in config/config.heroku.yml
sh/configure_heroku.rb
git push heroku master
Sheriff is a Rails app deployed via Capistrano. It needs:
- Relational database (tested with MySQL/Postgres)
- Rack server (tested with passenger)
- Mail setup in e.g. sheriff/shared/config/initializers/mail.rb
- (Optional) Resque for higher responsiveness / no timeouts
- (Optional) goyyamobile.com account for sms notifications
- (Optional) Newrelic account for performance analysis
- (Optional) Hoptoad account for error reporting
For user 'deploy' group 'users' in /srv/sheriff
# on server:
sudo su
cd /srv
mkdir sheriff
chown -R deploy:users sheriff
sudo su deploy
cd /srv/sheriff
mkdir -p shared/config
mkdir -p shared/log
mkdir -p shared/pids
# add customized shared/config/config.yml + database.yml [+ newrelic.yml]
# from your box
bundle exec cap deploy
Use anything rack-ish e.g. passenger start [OPTIONS]
passenger start --port 3000 --address myhost.com --environment production --max-pool-size 1
or add via normal apache/nginx config.
Don't let those log files grow!
sudo ln -s /srv/sheriff/current/config/logrotate /etc/cron.d/sheriff
To notice when a report is missing, a cron job has to check for it every minute.
* * * * * cd /srv/sheriff/current && RAILS_ENV=production ruby sh/cron_minute.rb && deputy Cron.sheriff
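The idea behind such a periodic check can be sketched as follows. This is only an illustration of a "reported every 10 minutes" rule, not the contents of sh/cron_minute.rb: the `overdue?` helper and its arguments are assumptions.

```ruby
# Hypothetical helper: a report is overdue when more than interval_seconds
# have passed since it was last received.
def overdue?(last_reported_at, interval_seconds, now = Time.now)
  now - last_reported_at > interval_seconds
end

ten_minutes = 10 * 60
last_report = Time.at(0)

puts overdue?(last_report, ten_minutes, Time.at(5 * 60))
puts overdue?(last_report, ten_minutes, Time.at(11 * 60))
```

Because nothing arrives when a report is missing, a check like this has to be triggered from outside, which is what the cron entry is for.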
- remove capistrano-ext dependency
- make sms provider configurable (create a gem for that ?)
- make 1.9 compatible
- highlight and notify any new error/alert message <-> set them to default email -> user can adjust down
- make plugin OPTIONS configurable