observability

There are 2590 repositories under the observability topic.

  • netdata/netdata

    The fastest path to AI-powered full stack observability, even for lean teams.

    Language: C 76.6k 1.3k 8.1k 6.2k
  • apache/skywalking

    APM, Application Performance Monitoring System

    Language: Java 24.6k8145.7k6.6k
  • SigNoz/signoz

    SigNoz is an open-source observability platform native to OpenTelemetry with logs, traces and metrics in a single application. An open-source alternative to DataDog, NewRelic, etc. 🔥 🖥. 👉 Open source Application Performance Monitoring (APM) & Observability tool

    Language: TypeScript 24.2k1204k1.8k
  • mlflow/mlflow

    The open source developer platform to build AI/LLM applications and models with confidence. Enhance your AI applications with end-to-end tracking, observability, and evaluations, all in one integrated platform.

    Language: Python 22.9k3144.7k5k
  • cilium/cilium

    eBPF-based Networking, Security, and Observability

    Language: Go 22.8k30511.4k3.4k
  • jaegertracing/jaeger

    CNCF Jaeger, a Distributed Tracing Platform

    Language: Go 22.1k3152.2k2.7k
  • elastic/kibana

    Your window into the Elastic Stack

    Language: TypeScript 20.8k83181.8k8.5k
  • PrefectHQ/prefect

    Prefect is a workflow orchestration framework for building resilient data pipelines in Python.

    Language: Python 20.8k1616.6k2k
  • vectordotdev/vector

    A high-performance observability data pipeline.

    Language: Rust 20.6k1498.2k1.9k
  • langfuse/langfuse

    🪢 Open source LLM engineering platform: LLM Observability, metrics, evals, prompt management, playground, datasets. Integrates with OpenTelemetry, Langchain, OpenAI SDK, LiteLLM, and more. 🍊YC W23

    Language: TypeScript 18.1k522k1.7k
  • mikeroyal/Self-Hosting-Guide

    Self-Hosting Guide. Learn all about locally hosting (on premises & private web servers) and managing software applications by yourself or your organization. Including Cloud, LLMs, WireGuard, Automation, Home Assistant, and Networking.

    Language: Dockerfile 17.6k20829872
  • openzipkin/zipkin

    Zipkin is a distributed tracing system

    Language: Java 17.3k6711.3k3.1k
  • openobserve/openobserve

    Modern observability platform: 10x easier, 140x lower storage cost, petabyte scale. Open-source alternative to Elasticsearch/Splunk/Datadog for logs, metrics, traces, RUM, and more.

    Language: Rust 17.1k902.9k688
  • kubesphere/kubesphere

    The container platform tailored for Kubernetes multi-cloud, datacenter, and edge management ⎈ 🖥 ☁️

    Language: Go 16.7k2214.3k2.7k
  • VictoriaMetrics/VictoriaMetrics

    VictoriaMetrics: fast, cost-effective monitoring solution and time series database

    Language: Go 15.3k1504.4k1.5k
  • thanos-io/thanos

    Highly available Prometheus setup with long term storage capabilities. A CNCF Incubating project.

    Language: Go 13.8k2362.9k2.2k
  • ccfos/nightingale

    Nightingale is to monitoring and alerting what Grafana is to visualization.

    Language: Go 12.6k1651.5k1.6k
  • Effect-TS/effect

    Build production-ready applications in TypeScript

    Language: TypeScript 11.9k281k422
  • kubeshark/kubeshark

    The API traffic analyzer for Kubernetes providing real-time K8s protocol-level visibility, capturing and monitoring all traffic and payloads going in, out and across containers, pods, nodes and clusters. Inspired by Wireshark, purposely built for Kubernetes

    Language: Go 11.5k69351504
  • grafana/pyroscope

    Continuous Profiling Platform. Debug performance issues down to a single line of code

    Language: Go 11k891.4k703
  • upgundecha/howtheysre

    A curated collection of publicly available resources on how technology and tech-savvy organizations around the world practice Site Reliability Engineering (SRE)

    Language: JavaScript 9.6k23512861
  • hyperdxio/hyperdx

    Resolve production issues, fast. An open source observability platform unifying session replays, logs, metrics, traces and errors powered by Clickhouse and OpenTelemetry.

    Language: TypeScript 9k27237323
  • highlight/highlight

    highlight.io: The open source, full-stack monitoring platform. Error monitoring, session replay, logging, distributed tracing, and more.

    Language: TypeScript 9k312.8k465
  • openstatusHQ/openstatus

    🫖 Uptime monitoring & API monitoring as code with status page 🫖

    Language: TypeScript 7.9k26220514
  • coroot/coroot

    Coroot is an open-source observability and APM tool with AI-powered Root Cause Analysis. It combines metrics, logs, traces, continuous profiling, and SLO-based alerting with predefined dashboards and inspections.

    Language: Go 7.1k47302327
  • projectcalico/calico

    Cloud native networking and network security

    Language: Go 6.9k1183.5k1.5k
  • apache/hertzbeat

    An AI-powered next-generation open source real-time observability system.

    Language: Java 6.8k661.1k1.2k
  • traceloop/openllmetry

    Open-source observability for your GenAI or LLM application, based on OpenTelemetry

    Language: Python 6.6k17300824
  • open-telemetry/opentelemetry-collector

    OpenTelemetry Collector

    Language: Go 6.3k913.6k1.8k
  • pixie-io/pixie

    Instant Kubernetes-Native Application Observability

    Language: C++ 6.2k76682478
  • OneUptime/oneuptime

    Complete open-source monitoring and observability platform.

    Language: TypeScript 6.1k26561294
  • kubernetes/kube-state-metrics

    Add-on agent to generate and expose cluster-level metrics.

    Language: Go 6k781k2.1k
  • GreptimeTeam/greptimedb

    Open-source, cloud-native, unified observability database for metrics, logs and traces, supporting SQL/PromQL/Streaming.

    Language: Rust 5.6k541.8k425
  • deepfence/ThreatMapper

    Open Source Cloud Native Application Protection Platform (CNAPP)

    Language: TypeScript 5.2k58609636
  • coze-dev/coze-loop

    Next-generation AI Agent Optimization Platform: Cozeloop addresses challenges in AI agent development by providing full-lifecycle management capabilities from development, debugging, and evaluation to monitoring.

    Language: Go 5.1k3396691
  • grafana/mimir

    Grafana Mimir provides horizontally scalable, highly available, multi-tenant, long-term storage for Prometheus.

    Language: Go 4.8k1572.3k666