data-observability

There are 34 repositories under the data-observability topic.

  • OpenMetadata

    open-metadata/OpenMetadata

    OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column-level lineage, and seamless team collaboration.

    Language:TypeScript4.7k466.7k913
  • soda-core

    sodadata/soda-core

    :zap: Data quality testing for the modern data stack (SQL, Spark, and Pandas) https://www.soda.io

    Language:Python1.8k11344192
  • elementary

    elementary-data/elementary

    The dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.

    Language:HTML1.8k9506153
  • re-data/re-data

    re_data - fix data issues before your users & CEO would discover them 😊

    Language:HTML1.5k24196121
  • odd-platform

    opendatadiscovery/odd-platform

    First open-source data discovery and observability platform. We make life easy for data practitioners so you can focus on your business.

    Language:Java1.1k1962796
  • piperider

    InfuseAI/piperider

    Code review for data in dbt

    Language:Python474147422
  • elementary-data/dbt-data-reliability

    dbt package that is part of Elementary, the dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.

    Language:Python35952078
  • data-drift/data-drift

    Metrics Observability & Troubleshooting

    Language:HTML30554811
  • datachecks

    waterdipai/datachecks

    Open Source Data Quality Monitoring.

    Language:Python13326017
  • re-data/dbt-re-data

    re_data - fix data issues before your users & CEO would discover them 😊

    Language:Python9641339
  • swiple

    Swiple/swiple

    Swiple enables you to easily observe, understand, validate and improve the quality of your data

    Language:Python782310
  • dqo

    dqops/dqo

    Data Quality and Observability platform for the whole data lifecycle, from profiling new data sources to full automation with Data Observability. Configure data quality checks from the UI or in YAML files, let DQOps run the data quality checks daily to detect data quality issues.

    Language:Java706912
  • oslabs-beta/DataDoc

    Endpoint downtime detection, monitoring, and traffic simulation developer tool

    Language:JavaScript66301
  • sodadata/soda-spark

    Soda Spark is a PySpark library that helps you with testing your data in Spark Dataframes

    Language:Python6311488
  • data-observability-installer

    DataKitchen/data-observability-installer

    Installer for DataKitchen's Open Source Data Observability Products. Data breaks. Servers break. Your toolchain breaks. Ensure your team is the first to know and the first to solve with visibility across and down your data estate. Save time with simple, fast data quality test generation and execution. Trust your data, tools, and systems end to end.

    Language:Python584144
  • opendatadiscovery/odd-collector

    Open-source metadata collector based on ODD Specification

    Language:Python4237813
  • dataops-observability

    DataKitchen/dataops-observability

    DataOps Observability is part of DataKitchen's Open Source Data Observability. DataOps Observability monitors every data journey from data source to customer value, from any team development environment into production, across every tool, team, environment, and customer so that problems are detected, localized, and understood immediately.

    Language:Python30272
  • dataops-testgen

    DataKitchen/dataops-testgen

    DataOps TestGen is part of DataKitchen's Open Source Data Observability. DataOps TestGen delivers simple, fast data quality test generation and execution through data profiling, new dataset screening and hygiene review, algorithmic generation of data quality validation tests, ongoing testing of new data refreshes, and continuous data anomaly monitoring.

    Language:Python25240
  • dataops-observability-agents

    DataKitchen/dataops-observability-agents

    DataOps Observability Integration Agents are part of DataKitchen's Open Source Data Observability. They connect to various ETL, ELT, BI, data science, data visualization, data governance, and data analytic tools. They provide logs, messages, metrics, overall run-time start/stop, subtask status, and scheduling information to DataOps Observability.

    Language:Python20241
  • open-metadata/openmetadata-site

    Open Standard for Metadata. A Single place to Discover, Collaborate and Get your data right.

    Language:CSS11389
  • sodadata/soda-github-action

    :zap: Prevent downstream data quality issues by integrating the Soda Library into your CI/CD pipeline.

    Language:Python11610
  • kiwicom/terraform-provider-montecarlo

    This open-source Terraform provider enables users to seamlessly integrate the Monte Carlo data reliability platform into their infrastructure as code (IaC) workflows.

    Language:Go9622
  • cgnorthcutt/reliablity_framework_for_rag

    Demo showing how the Trustworthy Language Model adds reliability to LLM outputs and improves RAG, agents, and data enrichment workflows. It can be used to improve fine-tuning of LLMs, accuracy of LLM outputs, and smart routing for RAG and agents.

    Language:Jupyter Notebook4202
  • datasphere-oss/datasphere

    DataSphere is the first open-source cloud-native data observability platform that helps you trace the whole data infrastructure in your warehouses, lakes and databases.

    Language:Java4324
  • GuinsooLab/stealthward

    dbt-native framework built to observe the modern data stack

    Language:HTML4020
  • opendatadiscovery/odd-collector-gcp

    Open-source GCP metadata collector based on ODD Specification

    Language:Python43150
  • annamatias/dataengineer

    Trending code, platforms, tools, and processes.

    Language:Python1100
  • Jaimeloeuf/Jevents

    A simple to use EventEmitter and Data-Observer python package.

    Language:Python1
  • swiple-action

    Swiple/swiple-action

    Automatically validate datasets, poll task status, and display validation results in a GitHub pull request using Swiple.

    Language:Python1100
  • SachinVarghese/pgamber

    Data observability for PostgreSQL using alibi-detect

    Language:Go0100
  • mark-antal-csizmadia/em-simple

    Expectation Maximization (EM) algorithm for estimating maximum likelihood (ML) parameters of partially observed data on a three-node Bayesian Network Probabilistic Graphical Model.

    Language:Jupyter Notebook20
  • rishuatgithub/data-lin-observability

    Data Lineage Observability Project

    Language:Shell10