a data discovery and metadata engine

Primary language: Python. License: Apache License 2.0 (Apache-2.0).

Darkseal

Data Discovery Portal

Darkseal is a data discovery and metadata engine that improves the productivity of data analysts, data scientists, and engineers when they interact with data. It currently does this by indexing data resources (tables, dashboards, streams, etc.) and powering a PageRank-style search based on usage patterns (e.g. highly queried tables show up earlier than less queried tables). Think of it as Google search for data. The project is named after Norwegian explorer Roald Darkseal, the first person to discover the South Pole.
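To make the idea of usage-based ranking concrete, here is a minimal sketch, not Darkseal's actual search implementation. It assumes a hypothetical in-memory map of table names to query counts and a hypothetical helper `rank_tables`. Matching tables are simply ordered by how often they are queried.

```python
from typing import Dict, List


def rank_tables(term: str, usage_counts: Dict[str, int]) -> List[str]:
    """Return table names containing `term`, most-queried first.

    Hypothetical helper for illustration only; a real search engine
    would combine text relevance with usage signals.
    """
    matches = [name for name in usage_counts if term.lower() in name.lower()]
    # Sort by usage (descending), then by name so ties are deterministic.
    return sorted(matches, key=lambda name: (-usage_counts[name], name))


# Made-up usage data: table name -> number of queries observed.
usage = {
    "sales.orders": 950,
    "sales.orders_backup": 12,
    "hr.people": 300,
    "staging.orders_tmp": 47,
}

ranked = rank_tables("orders", usage)
print(ranked)
```

In this sketch the heavily queried `sales.orders` ranks ahead of the rarely used backup and staging tables, which mirrors the behaviour described above.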

Requirements

  • Python = 3.6 or 3.7
  • Node = v10 or v12 (v14 may have compatibility issues)
  • npm >= 6

Getting Started

See the Darkseal installation documentation for a quick start that bootstraps a default instance of Darkseal with dummy data.

Architecture Overview

See the Architecture documentation for an overview of Darkseal's architecture.

Supported Entities

  • Tables (from Databases)
  • People (from HR systems)
  • Dashboards

Supported Integrations

Table Connectors

  • redshift
  • druid
  • hive
  • big-query
  • es
  • databricks
  • dremio
  • oracle
  • postgres
  • presto
  • snowflake
  • delta

Darkseal can also connect to any database that provides a `dbapi` or `sql_alchemy` interface (which most databases do).
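As an illustration of why a DB-API interface is enough to harvest table metadata, the sketch below uses Python's built-in `sqlite3` module, which implements DB-API 2.0. It is not Darkseal's connector code. The function name `extract_table_metadata` is hypothetical, and the catalog query against `sqlite_master` is specific to SQLite; other databases expose their catalogs differently.

```python
import sqlite3
from typing import Dict, List


def extract_table_metadata(conn) -> Dict[str, List[str]]:
    """Map each table name to its column names via a DB-API connection.

    Hypothetical helper for illustration only.
    """
    cursor = conn.cursor()
    # SQLite-specific catalog lookup; an assumption of this sketch.
    cursor.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    tables = [row[0] for row in cursor.fetchall()]

    metadata = {}
    for table in tables:
        # cursor.description is part of DB-API 2.0 and lists result columns,
        # so an empty SELECT is enough to discover the column names.
        cursor.execute('SELECT * FROM "{}" LIMIT 0'.format(table))
        metadata[table] = [column[0] for column in cursor.description]
    return metadata


# Build a throwaway in-memory database with two example tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
conn.execute("CREATE TABLE people (name TEXT, team TEXT)")

metadata = extract_table_metadata(conn)
conn.close()
print(metadata)
```

The same pattern of opening a cursor, running a query, and reading `cursor.description` should carry over to other DB-API drivers, with only the catalog query changing.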

Table Column Statistics

Dashboard Connectors

ETL Orchestration

License

Apache License