Chaos Mesh is an open source, cloud-native Chaos Engineering platform. It offers many types of fault simulation and rich capabilities for orchestrating fault scenarios.
Using Chaos Mesh, you can conveniently simulate abnormalities that might occur in real development, testing, and production environments and find potential problems in the system. To lower the barrier to entry for a Chaos Engineering project, Chaos Mesh provides a visual interface: you can design your chaos scenarios in the Web UI and monitor the status of chaos experiments.
Chaos Mesh is a Cloud Native Computing Foundation (CNCF) incubating project. If you are an organization that wants to help shape the evolution of technologies that are container-packaged, dynamically-scheduled and microservices-oriented, consider joining the CNCF. For details about who's involved and how Chaos Mesh plays a role, read the CNCF announcement.
At the current stage, Chaos Mesh has the following components:
- Chaos Operator: the core component for chaos orchestration. Fully open sourced.
- Chaos Dashboard: a Web UI for managing, designing, and monitoring chaos experiments.
See the following demo video for a quick view of Chaos Mesh:
Chaos Operator injects chaos into applications and the Kubernetes infrastructure in a manageable way, providing simple, custom definitions for chaos experiments and automatic orchestration. Its components include:
- Controller-manager: schedules and manages the lifecycle of CRD objects.
- Chaos-daemon: runs as a DaemonSet with privileged system permissions over the network, cgroups, etc. on a specific node.
Chaos Operator uses CustomResourceDefinitions (CRDs) to define chaos objects. The current implementation supports several types of CRD objects for fault injection, namely DNSChaos, PodChaos, PodIOChaos, PodNetworkChaos, NetworkChaos, IOChaos, TimeChaos, StressChaos, and KernelChaos, which correspond to the following major actions (experiments):
- pod-kill: The selected pod is killed (a ReplicaSet or similar controller may be needed to ensure the pod is restarted).
- pod-failure: The selected pod is unavailable for a specified period of time.
- container-kill: The selected container is killed in the selected pod.
- netem chaos: Network chaos such as delay, duplication, etc.
- network-partition: Simulate network partition.
- IO chaos: Simulate file system faults such as I/O delay, read/write errors, etc.
- time chaos: The selected pod will be injected with clock skew.
- cpu-burn: Simulate CPU stress on the selected pod.
- memory-burn: Simulate memory stress on the selected pod.
- kernel chaos: The selected pod is injected with kernel errors (slab, bio, etc.).
- dns chaos: The selected pod is injected with DNS faults, such as error or random responses.
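
Because each experiment is a Kubernetes custom resource, it can be described as plain structured data. The following is a minimal sketch in Python, not an official client: it builds two manifests as dictionaries and prints one as JSON, which `kubectl apply -f -` accepts. The `chaos-mesh.org/v1alpha1` API version and the `action`, `mode`, `selector`, `delay`, and `duration` fields reflect the Chaos Mesh CRD schema as commonly documented and should be checked against the Chaos Mesh Docs for your version. The helper names, resource names, namespaces, and labels are hypothetical.

```python
import json

# Assumed API group/version for Chaos Mesh CRDs; verify against your install.
API_VERSION = "chaos-mesh.org/v1alpha1"


def make_selector(namespace, labels):
    """Build a selector that targets pods by namespace and labels."""
    return {"namespaces": [namespace], "labelSelectors": dict(labels)}


def make_chaos(kind, name, spec, namespace="chaos-testing"):
    """Wrap an experiment spec in the standard Kubernetes object envelope."""
    return {
        "apiVersion": API_VERSION,
        "kind": kind,
        "metadata": {"name": name, "namespace": namespace},
        "spec": spec,
    }


# pod-kill: kill one randomly chosen pod labelled app=web (hypothetical label).
pod_kill = make_chaos(
    "PodChaos",
    "pod-kill-example",
    {
        "action": "pod-kill",
        "mode": "one",
        "selector": make_selector("default", {"app": "web"}),
    },
)

# netem chaos: add 100ms of latency to all matching pods for 30 seconds.
network_delay = make_chaos(
    "NetworkChaos",
    "network-delay-example",
    {
        "action": "delay",
        "mode": "all",
        "selector": make_selector("default", {"app": "web"}),
        "delay": {"latency": "100ms"},
        "duration": "30s",
    },
)

# JSON is valid input for kubectl, so no YAML library is required.
print(json.dumps(pod_kill, indent=2))
```

Saved as a script (for example under the hypothetical name `make_chaos.py`), the output could be piped to `kubectl apply -f -` on a cluster where Chaos Mesh is installed. Generating manifests from code like this makes it easier to parameterize targets and durations across many experiments.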
See Chaos Mesh Docs.
See ADOPTERS.
For blogs on Chaos Mesh design and implementation, features, chaos engineering, community updates, and more, see Chaos Mesh Blogs. Here are some recommended ones to start with:
- Chaos Mesh 2.0: To a Chaos Engineering Ecology
- Chaos Mesh - Your Chaos Engineering Solution for System Resiliency on Kubernetes
- Run Your First Chaos Experiment in 10 Minutes
- How to Simulate I/O Faults at Runtime
- Simulating Clock Skew in K8s Without Affecting Other Containers on the Node
- Building an Automated Testing Framework Based on Chaos Mesh and Argo
See the contributing guide and development guide.
Please reach out for bugs, feature requests, and other issues via:
- Following us on Twitter @chaos_mesh.
- Joining the #project-chaos-mesh channel in the CNCF Slack workspace.
- Filing an issue or opening a PR against this repository.
- Chaos Mesh Community Monthly (community and project-level updates, community sharing/demo, office hours)
  - Time: on the fourth Thursday of every month (unless otherwise specified)
  - RSVP here
  - Meeting minutes
- Chaos Mesh Development Meeting (releases, roadmap/features/RFC planning and discussion, issue triage/discussion, etc.)
  - Time: every other Tuesday (unless otherwise specified)
  - RSVP here
  - Meeting minutes
- Grant Tarrant-Fisher: Integrate your Reliability Toolkit with Your World
- Yoshinori Teraoka: Streake: Chaos Mesh によるカオスエンジニアリング
- Sébastien Prud'homme: Chaos Mesh : un générateur de chaos pour Kubernetes
- Craig Morten
- Ronak Banka: Getting Started with Chaos Mesh and Kubernetes
- kondoumh: Kubernetes ネイティブなカオスエンジニアリングツール Chaos Mesh を使ってみる
- Vadim Tkachenko: ChaosMesh to Create Chaos in Kubernetes
- Hui Zhang: How a Top Game Company Uses Chaos Engineering to Improve Testing
- Anurag Paliwal
- Pavan Kumar: Chaos Engineering in Kubernetes using Chaos Mesh
- Jessica Cherry: Test your Kubernetes experiments with an open source web interface
- λ.eranga: Chaos Engineering with Chaos Mesh
- Tomáš Kubica: Kubernetes prakticky: zlounství s Chaos Mesh a Azure Chaos Studio
- mend: Chaos Meshで何ができるのか見てみた
- Twain Taylor: Chaos Mesh Simplifies & Organizes Chaos Engineering For Kubernetes
- Saiyam Pathak
- CodeZine: オープンソースのカオステストツール「Chaos Mesh 1.0」、一般提供を開始
- @IT atmarkit: Kubernetes 向けカオスエンジニアリングプラットフォーム「Chaos Mesh 1.0」が公開
- Publickey: Kubernetes の Pod やネットワークをわざと落としまくってカオスエンジニアリングのテストができる「Chaos Mesh」がバージョン 1.0 に到達
- InfoQ: Chaos Engineering on Kubernetes : Chaos Mesh Generally Available with v1.0
- TechGenix: Chaos Mesh Promises to Bring Order to Chaos Engineering
See FAQs.
See ROADMAP.
Chaos Mesh is licensed under the Apache License, Version 2.0. See LICENSE for the full license text.
Chaos Mesh is a trademark of The Linux Foundation. All rights reserved.