elt

There are 293 repositories under the elt topic.

  • apache/airflow

    Apache Airflow - A platform to programmatically author, schedule, and monitor workflows

    Language: Python 35k7569.1k13.7k
  • airbytehq/airbyte

    The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.

    Language: Python 14.5k17913.7k3.7k
  • apache/doris

    Apache Doris is an easy-to-use, high performance and unified analytics database.

    Language: Java 11.6k2826.9k3.1k
  • dbt-labs/dbt-core

    dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.

    Language: Python 9.1k1375.2k1.5k
  • apache/seatunnel

    SeaTunnel is a next-generation super high-performance, distributed, massive data integration tool.

    Language: Java 7.5k1732.9k1.6k
  • mage-ai/mage-ai

    🧙 Build, run, and manage data pipelines for integrating and transforming data.

    Language: Python 7.3k63690666
  • kestra-io/kestra

    Infinitely scalable, event-driven, language-agnostic orchestration and scheduling platform to manage millions of workflows declaratively in code.

    Language: Java 6.9k621.7k400
  • cloudquery/cloudquery

    The open source high performance ELT framework powered by Apache Arrow

    Language: Go 5.6k582.2k499
  • apache/flink-cdc

    Flink CDC is a streaming data integration tool

    Language: Java 5.4k1441.7k1.8k
  • rudderlabs/rudder-server

    Privacy and Security focused Segment-alternative, in Golang and React

    Language: Go 4k61136297
  • quarylabs/quary

    Open-source BI for engineers

    Language: Rust 2k123535
  • dlt-hub/dlt

    data load tool (dlt) is an open source Python library that makes data loading easy 🛠️

    Language: Python 1.9k18394109
  • meltano/meltano

    Meltano: the declarative code-first data integration engine that powers your wildest data and ML-powered product ideas. Say goodbye to writing, maintaining, and scaling your own API integrations.

    Language: Python 1.6k136.7k145
  • TobikoData/sqlmesh

    Efficient data transformation and modeling framework that is backwards compatible with dbt.

    Language: Python 1.4k18434115
  • dataform-co/dataform

    Dataform is a framework for managing SQL based data operations in BigQuery

    Language: TypeScript 79820474148
  • kuwala-io/kuwala

    Kuwala is the no-code data platform for BI analysts and engineers, enabling you to build powerful analytics workflows. We set out to bring the state-of-the-art data engineering tools you love, such as Airbyte, dbt, and Great Expectations, together in one intuitive interface built with React Flow. In addition, we feed third-party data into data science models and products, with a focus on geospatial data. Currently, the following data connectors are available worldwide: a) high-resolution demographics data, b) points of interest from OpenStreetMap, c) Google Popular Times.

    Language: JavaScript 774137252
  • raystack/optimus

    Optimus is an easy-to-use, reliable, and performant workflow orchestrator for data transformation, data modeling, pipelines, and data quality management.

    Language: Go 73918268153
  • artie-labs/transfer

    Database replication platform that leverages change data capture. Stream production data from databases to your data warehouse (Snowflake, BigQuery, Redshift) in real-time.

    Language: Go 54793525
  • Datavault-UK/automate-dv

    A free-to-use dbt package for creating and loading Data Vault 2.0 compliant data warehouses (powered by dbt, an open source data engineering tool and registered trademark of dbt Labs).

  • gouline/dbt-metabase

    dbt + Metabase integration

    Language: Python 433108463
  • vmware/versatile-data-kit

    One framework to develop, deploy and operate data workflows with Python and SQL.

    Language: Python 4141694754
  • osalvador/ReplicaDB

    ReplicaDB is an open source tool for database replication, designed for efficiently transferring bulk data between relational and non-relational databases.

    Language: Java 3712410294
  • astronomer/astro-sdk

    Astro SDK allows rapid and clean development of {Extract, Load, Transform} workflows using Python and SQL, powered by Apache Airflow.

    Language: Python 3261282739
  • aws-samples/aws-etl-orchestrator

    A serverless architecture for orchestrating ETL jobs in arbitrarily-complex workflows using AWS Step Functions and AWS Lambda.

    Language: Python 325397136
  • slingdata-io/sling-cli

    Sling is a CLI tool that extracts data from a source storage/database and loads it in a target storage/database.

    Language: Go 287620716
  • cuebook/cuelake

    Use SQL to build ELT pipelines on a data lakehouse.

    Language: JavaScript 283122928
  • datacoves/dbt-coves

    CLI tool for dbt users that simplifies the creation of staging model files (yml and sql).

    Language: Python 223105614
  • umitkaanusta/reddit-detective

    Play detective on Reddit: Discover political disinformation campaigns, secret influencers and more

    Language: Python 20961814
  • DataRecce/recce

    Data Reconnaissance - pull request review tool for dbt projects

    Language: TypeScript 1927483
  • airbytehq/PyAirbyte

    PyAirbyte brings the power of Airbyte to every Python developer.

    Language: Python 160414316
  • unytics/airbyte_serverless

    Airbyte made simple (no UI, no database, no cluster)

    Language: Python 122327
  • 173TECH/sayn

    Data processing and modelling framework for automating tasks (incl. Python & SQL transformations).

    Language: Python 11774714
  • faros-ai/airbyte-connectors

    Airbyte connectors (sources & destinations) + Airbyte CDK for JavaScript/TypeScript

    Language: TypeScript 1051210160
  • yokawasa/databricks-notebooks

    Collection of sample Databricks Spark notebooks (mostly for Azure Databricks)

    Language: Jupyter Notebook 799068
  • codeforkjeff/dbt-sqlite

    A SQLite adapter plugin for dbt (data build tool)

    Language: Python 7443611
  • zsvoboda/dbd

    dbd is a database prototyping tool that enables data analysts and engineers to quickly load and transform data in SQL databases.

    Language: Python 56202