Awesome Data Discovery and Observability
This repository contains a curated collection of awesome data catalogs and observability platforms that will help you discover, observe, and manage data in your organization.
Contents: Existing Data Discovery and Observability Solutions
Open-Source Data Catalogs
Based on Open Standard: β
Federation: β
ML 1st Citizen: β
Data Quality: β
End-to-end Lineage: β
Observability: β
More features
Strategy: Push
UX personalization: No
AI autowiring: No
Network-based: Yes
Rich data profiling: No
Supported data sources:
Search-based: Yes
Recommendations: Yes
Schemas, Description: Yes
Complex schemas: No
Data preview: Yes
Column statistics: Yes
Data owner: Yes
Top data users: Yes
Change notifications: No
Change feed: No
Supported data sources: Hive, Redshift, Druid, RDBMS, Presto, Snowflake
Based on Open Standard: β
Federation: β
ML 1st Citizen: β
Data Quality: β
End-to-end Lineage: β
Observability: β
More features
Strategy: Push, Pull
UX personalization: No
AI autowiring: No
Network-based: Yes
Rich data profiling: No
Supported data sources: Hive, Kafka, RDBMS
Based on Open Standard: OpenLineage
Federation: β
ML 1st Citizen: β
Data Quality: β
End-to-end Lineage: β
Observability: β
More features
Strategy: Push
UX personalization: No
AI autowiring: No
Network-based: No
Rich data profiling: No
Supported data sources: S3, Kafka
Based on Open Standard: β
Federation: β
ML 1st Citizen: β
Data Quality: β
End-to-end Lineage: β
Observability: β
More features
Strategy: Push
UX personalization: No
AI autowiring: No
Network-based: No
Rich data profiling: No
Supported data sources: HBase, Hive, Sqoop, Kafka, Storm
Based on Open Standard: β
Federation: βοΈ
ML 1st Citizen: β
Data Quality: β
End-to-end Lineage: β
Observability: β
More features
Strategy: Push
UX personalization: No
AI autowiring: No
Network-based: No
Rich data profiling: No
Supported data sources:
Based on Open Standard: β
Federation: βοΈ
ML 1st Citizen: β
Data Quality: β
End-to-end Lineage: β
Observability: β
More features
Strategy: Push via UI
UX personalization: No
AI autowiring: No
Network-based: No
Rich data profiling: No
Supported data sources: Mostly GeoData
Proprietary Data Catalogs
Based on Open Standard: β
Federation: β
ML 1st Citizen: β
Data Quality: ?
End-to-end Lineage: β
Observability: β
More features
Strategy: Push
UX personalization: Yes
AI autowiring: ?
Network-based: No
Rich data profiling: ?
Supported data sources:
Based on Open Standard: β
Federation: β
ML 1st Citizen: β
Data Quality: ?
End-to-end Lineage: β
Observability: β
More features
Strategy: Push
UX personalization: ?
AI autowiring: ?
Network-based: Yes
Rich data profiling: Yes
Supported data sources:
Based on Open Standard: β
Federation: β
ML 1st Citizen: β
Data Quality: βοΈ
End-to-end Lineage: β
Observability: β
More features
Strategy: Push
UX personalization: Yes
AI autowiring: No
Network-based: No
Rich data profiling: No
Supported data sources:
Based on Open Standard: β
Federation: β
ML 1st Citizen: β
Data Quality: βοΈ
End-to-end Lineage: β
Observability: β
More features
Strategy: Pull
UX personalization: ?
AI autowiring: ?
Network-based: No
Rich data profiling: ?
Supported data sources: Presto, Deequ, Atlas, Airflow, Hudi
Based on Open Standard: β
Federation: β
ML 1st Citizen: β
Data Quality: βοΈ
End-to-end Lineage: β
Observability: β
More features
Strategy: Pull
UX personalization: Yes
AI autowiring: No
Network-based: No
Rich data profiling: Yes
Supported data sources:
Based on Open Standard: β
Federation: β
ML 1st Citizen: β
Data Quality: β
End-to-end Lineage: β
Observability: β
More features
Strategy: Push
UX personalization: No
AI autowiring: No
Network-based: No
Rich data profiling: No
Supported data sources:
Based on Open Standard: β
Federation: β
ML 1st Citizen: β
Data Quality: βοΈ
End-to-end Lineage: β
Observability: β
More features
Strategy: Push
UX personalization: Yes
AI autowiring: ?
Network-based: ?
Rich data profiling: Yes
Supported data sources:
Google Cloud Data Catalog
Based on Open Standard: β
Federation: β
ML 1st Citizen: β
Data Quality: ?
End-to-end Lineage: β
Observability: β
More features
Strategy: Pull
UX personalization: ?
AI autowiring: ?
Network-based: No
Rich data profiling: No
Supported data sources:
Based on Open Standard: β
Federation: β
ML 1st Citizen: β
Data Quality: ?
End-to-end Lineage: β
Observability: β
More features
Strategy: Pull
UX personalization: ?
AI autowiring: ?
Network-based: ?
Rich data profiling: ?
Supported data sources:
Data Observability Platforms
Based on Open Standard: β
Federation: β
ML 1st Citizen: β
Data Quality: βοΈ
End-to-end Lineage: β
Observability: βοΈ
More features
Strategy: Pull
UX personalization: ?
AI autowiring: ?
Network-based: ?
Rich data profiling: ?
Supported data sources: Snowflake, Hive, Kafka, Looker, Redshift, Tableau, BigQuery, Airflow, Fivetran, Presto, Mode, Periscope, Databricks, Glue, dbt, Chartio, Spark, AWS, S3, data.world, Google Cloud Platform
Based on Open Standard: β
Federation: β
ML 1st Citizen: β
Data Quality: βοΈ
End-to-end Lineage: β
Observability: β
More features
Strategy: Push
UX personalization: ?
AI autowiring: ?
Network-based: ?
Rich data profiling: ?
Supported data sources:
Based on Open Standard: β
Federation: β
ML 1st Citizen: β
Data Quality: βοΈ
End-to-end Lineage: β
Observability: β
More features
Strategy: Push
UX personalization: ?
AI autowiring: ?
Network-based: ?
Rich data profiling: ?
Supported data sources: