/grenade

Cause fires in production

Primary LanguageGo

Grenade 💣 🔥 Go Report Card

grenade is a tool for causing fires in production. I created this tool at Wattpad to answer questions like:

  • What happens when a developer deploys a change that chews up a ton of memory?
  • How quickly do I get an alert when log volume in production spikes dramatically?
  • Can a misbehaving service interfere with its neighbors in my production environment?

This is a useful activity to do particularly when operating at a scale where the blast radius of a failure in production can be quite large. For example, if you run a system where you have services with 300 replicas, it's good to know what's going to happen when you toss 300 grenades into production before someone inevitably does it by accident. This can help you:

  • Test out your observability tools.
  • Figure out what checks to put in place before a service is allowed to completely roll out in production.
  • Carry out fire drills to train on-call people.
  • Tune your monitors for pager alerts.
  • Highlight gaps in monitoring and resource limits.
  • See how resilient your infrastructure really is to unexpected failures.

⚠️ DISCLAIMER ⚠️

grenade does not simulate problems by faking monitoring data. It actually causes problems. Only deploy this to testing and pre-production environments, or if you want to upset your on-call engineer.

Build

make build will produce a binary called grenade at the top level.

Usage

grenade is a simple command line utility, see grenade --help

usage: grenade [<flags>]

Flags:
  --help                   Show context-sensitive help (also try --help-long and --help-man).
  --port=8080              Webserver port
  --delay=DELAY            trigger the grenade after specified delay e.g 5m30s
  --crash                  Process crash
  --exit                   Exit process with non-zero exit code
  --mem-spike              Cause a memory spike
  --mem-leak               Cause a memory leak
  --cpu-spike              Cause a CPU spike
  --goroutine-leak         Cause goroutines to be leaked
  --conn-spike=CONN-SPIKE  Cause a connect spike to an address
  --conn-leak=CONN-LEAK    Cause a connect leak to an address
  --dns-spike              Cause a spike in DNS lookups
  --log-spike              Cause a spike in log volume
  --fd-spike               Cause a spike in open file descriptors
  --disk-spike             Cause a spike in disk space usage
  --disk-leak              Cause a leak in disk space usage