ilyasstrh's Stars
wg/wrk
Modern HTTP benchmarking tool
quii/learn-go-with-tests
Learn Go with test-driven development
Netflix/chaosmonkey
Chaos Monkey is a resiliency tool that helps applications tolerate random instance failures.
systemdesign42/system-design
A resource to help you pass system design interviews and become good at work 👇
Huxpro/huxpro.github.io
My Blog / Jekyll Themes / PWA
chaos-mesh/chaos-mesh
A Chaos Engineering Platform for Kubernetes.
hjacobs/kubernetes-failure-stories
Compilation of public failure/horror stories related to Kubernetes
dastergon/awesome-chaos-engineering
A curated list of Chaos Engineering resources.
litmuschaos/litmus
Litmus helps SREs and developers practice chaos engineering in a cloud-native way. Chaos experiments are published at the ChaosHub (https://hub.litmuschaos.io). Community notes are at https://hackmd.io/a4Zu_sH4TZGeih-xCimi3Q
stern/stern
⎈ Multi pod and container log tailing for Kubernetes -- Friendly fork of https://github.com/wercker/stern
alexei-led/pumba
Chaos testing, network emulation, and stress testing tool for containers
gg-daddy/ebooks
logpai/loghub
A large collection of system log datasets for AI-driven log analytics [ISSRE'23]
open-telemetry/opentelemetry-demo
This repository contains the OpenTelemetry Astronomy Shop, a microservice-based distributed system intended to illustrate the implementation of OpenTelemetry in a near real-world environment.
alibaba/clusterdata
Cluster data collected from production clusters at Alibaba for cluster management research
HugoMatilla/Effective-JAVA-Summary
Summary of the book Effective Java 2nd Edition by Joshua Bloch
spegel-org/spegel
Stateless cluster local OCI registry mirror.
techiescamp/kubernetes-prometheus
Kubernetes Manifest files for setting up Prometheus monitoring on the Kubernetes cluster.
oreilly-mlsec/book-resources
nnmer/azure-services-map
A visual representation and reference to Azure services
CloudWise-OpenSource/GAIA-DataSet
GAIA, with the full name Generic AIOps Atlas, is an overall dataset for analyzing operation problems such as anomaly detection, log analysis, fault localization, etc.
microsoft/MCW-Enterprise-class-networking
MCW Enterprise-class networking in Azure
TimeEval/evaluation-paper
Supporting material and website for the paper "Anomaly Detection in Time Series: A Comprehensive Evaluation"
aws-samples/aws-get-started-workshop
Workshop to Get Started with AWS for Production Workloads
pavolloffay/kubecon-na-2023-opentelemetry-kubernetes-metrics-tutorial
Exploring the Power of Metrics Collection with OpenTelemetry on Kubernetes
isItObservable/Episode3--Kubernetes-Fluentbit
geovane-silva/restarting-pods-report
A Python script to generate a Kubernetes restarting pods report
Ashmita152/jaeger-datasets
Jaeger's sample datasets
daviddetorres/hipster-metrics-with-prometheus
How to monitor Hipster Shop app
hfyxin/Ts-models-log-data-analysis