data-governance
There are 137 repositories under the data-governance topic.
datahub-project/datahub
The Metadata Platform for your Data and AI Stack
open-metadata/OpenMetadata
OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column level lineage, and seamless team collaboration.
sodadata/soda-core
:zap: Data quality testing for the modern data stack (SQL, Spark, and Pandas) https://www.soda.io
elementary-data/elementary
The dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.
MarquezProject/marquez
Collect, aggregate, and visualize a data ecosystem's metadata
reata/sqllineage
SQL Lineage Analysis Tool powered by Python
opendatadiscovery/odd-platform
First open-source data discovery and observability platform. We make life easy for data practitioners so you can focus on your business.
odpi/egeria
Egeria core
Titan-Systems/titan
Titan Core - Snowflake infrastructure-as-code. Provision environments, automate deploys, CI/CD. Manage RBAC, users, roles, and data access. Declarative Python Resource API. Change Management tool for the Snowflake data warehouse.
data-drift/data-drift
Metrics Observability & Troubleshooting
tokern/data-lineage
Generate and Visualize Data Lineage from query history
tuva-health/tuva
Main repo including core data model, data marts, data quality tests, and terminology sets.
datachecks/dcs-core
Open Source Data Quality Monitoring.
GoogleCloudPlatform/bigquery-data-lineage
Reference implementation for real-time Data Lineage tracking for BigQuery using Audit Logs, ZetaSQL and Dataflow.
daxa-ai/pebblo
Pebblo enables developers to safely load data and promote their Gen AI app to deployment
sburn/docker-apache-atlas
This Apache Atlas is built from the latest release source tarball and patched to be run in a Docker container.
opendatadiscovery/opendatadiscovery-specification
ODD Specification is a universal open standard for collecting metadata.
hivemq/hivemq-edge
HiveMQ Edge is an MQTT gateway that enables interoperability between OT devices and IT systems. It translates diverse protocols into MQTT for streamlined communication and helps organize data into a unified namespace, making managing and streaming data across your infrastructure easier.
odpi/data-governance
Egeria's Guidance on Governance as well as large media files such as presentations and movies
mara/mara-schema
Mapping of DWH database tables to business entities, attributes & metrics in Python, with automatic creation of flattened tables
sanzujinwu/sanzuwu
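The Sanzuwu data middle platform integrates data ingestion, data development, data warehousing, data governance, data assets, data services, BI visualization, system administration, and other functional modules into one. It breaks down data barriers, solves the data-silo problem, and supports enterprises' digital transformation.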
conduktor/conduktor-poc-kafka-protocol
POC to demonstrate how to alter incoming/outgoing records in Kafka. It's a toy, don't use it in production.
aws-samples/document-processing-pipeline-for-regulated-industries
A boilerplate solution for processing image and PDF documents for regulated industries, with lineage and pipeline operations metadata services.
provectus/data-quality-gate
Data Quality Gate based on AWS
Tinkoff/data-detective
Data catalog for everything in your company
GoogleCloudPlatform/auto-data-tokenize
Identify and tokenize sensitive data automatically using Cloud DLP and Dataflow
opendatadiscovery/odd-collector
Open-source metadata collector based on ODD Specification
sonhmai/data-systems-design
System Design, Solution Architecture, Data Systems Practice
WeBankBlockchain/Data-Export
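Data-Export supports exporting on-chain data to storage media such as MySQL and ES that are convenient for big-data processing, addressing the problems of complex querying, analysis, visualization, and processing of blockchain data.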
bufbuild/bufstream-demo
A demo of Bufstream, a drop-in replacement for Apache Kafka that's 8x less expensive to operate and brings broker-side schema awareness to Kafka
microsoft/purview-data-governance-masterclass
An opinionated end-to-end data governance implementation.
getstrm/pace
Data policy IN, dynamic view OUT: PACE is the Policy As Code Engine. It helps you to programmatically create and apply a data policy to a processing platform like Databricks, Snowflake or BigQuery (or plain ol' Postgres, even!) with definitions imported from Collibra, Datahub, ODD and the like.
ryandawsonuk/data-platforms-tools
Guide to data platforms and tools
WeBankBlockchain/Data-Stash
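Data-Stash is a data warehouse component based on FISCO-BCOS. By parsing a node's binlog, it generates a full backup of that node's state, enabling the node to separate hot and cold data and to prune data.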
WeBankBlockchain/Data-Reconcile
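Data-Reconcile is a blockchain-based reconciliation component. It provides a general-purpose data reconciliation solution based on blockchain smart-contract ledgers, along with a dynamically extensible reconciliation framework that supports custom development.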
tosh2230/stairlight
A data lineage tool that detects table dependencies from rendered SQL statements.