License: Creative Commons Zero v1.0 Universal (CC0-1.0)

Awesome Chaos Engineering

A curated list of awesome Chaos Engineering resources.

What is Chaos Engineering?

Chaos Engineering is the discipline of experimenting on a distributed system in order to build confidence in the system’s capability to withstand turbulent conditions in production. - Principles Of Chaos Engineering website.
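As a rough illustration of that definition, the sketch below runs one toy experiment: check a steady-state hypothesis, inject a fault, check the hypothesis again, then roll back. Everything in it (`Replica`, `Service`, `steady_state`, `run_experiment`) is a hypothetical in-memory stand-in invented for this example; it is not the API of any tool listed here, and a real experiment would target real infrastructure with a limited blast radius.

```python
import random


class Replica:
    """Hypothetical in-memory stand-in for one instance of a service."""

    def __init__(self, name):
        self.name = name
        self.alive = True

    def handle(self, request):
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name} handled {request}"


class Service:
    """Toy service that fails over to the next replica on error."""

    def __init__(self, replica_count):
        self.replicas = [Replica(f"replica-{i}") for i in range(replica_count)]

    def call(self, request):
        for replica in self.replicas:
            try:
                return replica.handle(request)
            except ConnectionError:
                continue  # try the next replica
        raise RuntimeError("all replicas are down")


def steady_state(service, probes=20):
    """Steady-state hypothesis: every probe request gets a response."""
    try:
        return all(service.call(f"probe-{i}") for i in range(probes))
    except RuntimeError:
        return False


def run_experiment(service, rng):
    # 1. Confirm the system is healthy before injecting anything.
    if not steady_state(service):
        return "aborted: steady state not met before the experiment"
    # 2. Inject the fault: take down one randomly chosen replica.
    victim = rng.choice(service.replicas)
    victim.alive = False
    # 3. Check whether the hypothesis still holds under the fault.
    survived = steady_state(service)
    # 4. Roll back so the experiment leaves no lasting damage.
    victim.alive = True
    return "hypothesis held" if survived else "weakness found"


if __name__ == "__main__":
    print(run_experiment(Service(3), random.Random()))  # redundancy absorbs the loss
    print(run_experiment(Service(1), random.Random()))  # single point of failure
```

With three replicas the toy service keeps answering after one is lost; with a single replica the same fault exposes a weakness, which is the kind of finding an experiment is meant to surface before a real outage does.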

Contents

Culture

Books

Education

Notable Tools

  • Chaos Monkey - A resiliency tool that helps applications tolerate random instance failures.
  • The Simian Army - A suite of tools for keeping your cloud operating in top form.
  • orchestrator - MySQL replication topology management and HA.
  • kube-monkey - An implementation of Netflix's Chaos Monkey for Kubernetes clusters.
  • Gremlin Inc. - Failure as a Service.
  • Pumba - Chaos testing and network emulation for Docker containers (and clusters).
  • Chaos Toolkit - A chaos engineering toolkit to help you build confidence in your software system.
  • ChaoSlingr - A Security Chaos Engineering tool. It focuses primarily on experimentation on AWS infrastructure to proactively surface system security failures.
  • PowerfulSeal - Adds chaos to your Kubernetes clusters, so that you can detect problems in your systems as early as possible. It kills targeted pods and takes VMs up and down.
  • drax - DC/OS Resilience Automated Xenodiagnosis tool. It helps to test DC/OS deployments by applying a Chaos Monkey-inspired, proactive and invasive testing approach.
  • Wiremock - API mocking (Service Virtualization) which enables modeling real-world faults and delays.
  • MockLab - API mocking (Service Virtualization) as a service which enables modeling real-world faults and delays.
  • Pod-Reaper - A rules-based pod-killing container. Pod-Reaper was designed to kill pods that meet specific conditions, which makes it usable for chaos testing in Kubernetes.
  • Muxy - A chaos testing tool for simulating real-world distributed system failures.
  • Toxiproxy - A TCP proxy to simulate network and system conditions for chaos and resiliency testing.
  • Blockade - Docker-based utility for testing network failures and partitions in distributed applications.
  • chaos-lambda - Randomly terminate ASG instances during business hours.
  • Namazu - Programmable fuzzy scheduler for testing distributed systems.
  • Chaos Monkey for Spring Boot - Injects latencies, exceptions, and terminations into Spring Boot applications.
  • Byte-Monkey - Bytecode-level fault injection for the JVM. It works by instrumenting application code on the fly to deliberately introduce faults like exceptions and latency.
  • GomJabbar - Chaos Monkey for your private cloud.
  • Turbulence - Tool focused on BOSH environments capable of stressing VMs, manipulating network traffic, and more. It is very similar to Gremlin.

Cloud Services

Papers

Blogs & Newsletters

Conferences & Meetups

Forums

Twitter

Contributing

Please take a look at the contribution guidelines first. Contributions are always welcome!