elt

There are 314 repositories under the elt topic.

  • apache/airflow

    Apache Airflow - A platform to programmatically author, schedule, and monitor workflows

    Language:Python38.2k76410.4k14.5k
  • airbytehq/airbyte

    The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.

    Language:Python16.7k18614.8k4.2k
  • apache/doris

    Apache Doris is an easy-to-use, high performance and unified analytics database.

    Language:Java13k2867.6k3.3k
  • dbt-labs/dbt-core

    dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.

    Language:Python10.2k1445.6k1.6k
  • apache/seatunnel

    SeaTunnel is a next-generation super high-performance, distributed, massive data integration tool.

    Language:Java8.2k1753.6k1.9k
  • mage-ai/mage-ai

    🧙 Build, run, and manage data pipelines for integrating and transforming data.

    Language:Python8.1k62898781
  • cloudquery/cloudquery

    The open source high performance ELT framework powered by Apache Arrow

    Language:Go5.9k652.2k513
  • apache/flink-cdc

    Flink CDC is a streaming data integration tool

    Language:Java5.9k1331.7k2k
  • rudderlabs/rudder-server

    Privacy and Security focused Segment-alternative, in Golang and React

    Language:Go4.1k62143319
  • dlt-hub/dlt

    data load tool (dlt) is an open source Python library that makes data loading easy 🛠️

    Language:Python2.9k18679190
  • quarylabs/quary

    Open-source BI for engineers

    Language:Rust2.2k134551
  • TobikoData/sqlmesh

    Efficient data transformation and modeling framework that is backwards compatible with dbt.

    Language:Python1.9k27617173
  • meltano/meltano

    Meltano: the declarative code-first data integration engine that powers your wildest data and ML-powered product ideas. Say goodbye to writing, maintaining, and scaling your own API integrations.

    Language:Python1.9k146.7k167
  • ucbepic/docetl

    A system for agentic LLM-powered data processing and ETL

    Language:Python1.4k1586126
  • dataform-co/dataform

    Dataform is a framework for managing SQL based data operations in BigQuery

    Language:TypeScript86327515167
  • kuwala-io/kuwala

    Kuwala is the no-code data platform for BI analysts and engineers, enabling you to build powerful analytics workflows. We have set out to bring the state-of-the-art data engineering tools you love, such as Airbyte, dbt, and Great Expectations, together in one intuitive interface built with React Flow. In addition, we feed third-party data into data science models and products, with a focus on geospatial data. Currently, the following data connectors are available worldwide: a) high-resolution demographics data, b) points of interest from OpenStreetMap, c) Google Popular Times.

    Language:JavaScript791137254
  • raystack/optimus

    Optimus is an easy-to-use, reliable, and performant workflow orchestrator for data transformation, data modeling, pipelines, and data quality management.

    Language:Go74717268154
  • artie-labs/transfer

    Database replication platform that leverages change data capture. Stream production data from databases to your data warehouse (Snowflake, BigQuery, Redshift, Databricks) in real-time.

    Language:Go61294330
  • Datavault-UK/automate-dv

    A free to use dbt package for creating and loading Data Vault 2.0 compliant Data Warehouses (powered by dbt, an open source data engineering tool, registered trademark of dbt Labs)

  • gouline/dbt-metabase

    dbt + Metabase integration

    Language:Python480109574
  • slingdata-io/sling-cli

    Sling is a CLI tool that extracts data from a source storage/database and loads it in a target storage/database.

    Language:Go4711231335
  • vmware/versatile-data-kit

    One framework to develop, deploy and operate data workflows with Python and SQL.

    Language:Python4351794756
  • osalvador/ReplicaDB

    ReplicaDB is an open source tool for database replication, designed for efficiently transferring bulk data between relational and non-relational databases.

    Language:Java4222611096
  • astronomer/astro-sdk

    Astro SDK allows rapid and clean development of {Extract, Load, Transform} workflows using Python and SQL, powered by Apache Airflow.

    Language:Python3591283247
  • aws-samples/aws-etl-orchestrator

    A serverless architecture for orchestrating ETL jobs in arbitrarily-complex workflows using AWS Step Functions and AWS Lambda.

    Language:Python333397138
  • DataRecce/recce

    The data-validation toolkit for enhanced dbt (data build tool) PR review

    Language:TypeScript2917777
  • cuebook/cuelake

    Use SQL to build ELT pipelines on a data lakehouse.

    Language:JavaScript284122928
  • datacoves/dbt-coves

    CLI tool for dbt users that simplifies the creation of staging model files (yml and sql)

    Language:Python254105915
  • airbytehq/PyAirbyte

    PyAirbyte brings the power of Airbyte to every Python developer.

    Language:Python240320742
  • umitkaanusta/reddit-detective

    Play detective on Reddit: Discover political disinformation campaigns, secret influencers and more

    Language:Python21361815
  • unytics/airbyte_serverless

    Airbyte made simple (no UI, no database, no cluster)

    Language:Python153328
  • 173TECH/sayn

    Data processing and modelling framework for automating tasks (incl. Python & SQL transformations).

    Language:Python12174715
  • faros-ai/airbyte-connectors

    Airbyte connectors (sources & destinations) + Airbyte CDK for JavaScript/TypeScript

    Language:TypeScript1111310362
  • yokawasa/databricks-notebooks

    Collection of Sample Databricks Spark Notebooks (mostly for Azure Databricks)

    Language:Jupyter Notebook839073
  • codeforkjeff/dbt-sqlite

    A SQLite adapter plugin for dbt (data build tool)

    Language:Python7753613
  • zsvoboda/dbd

    dbd is a database prototyping tool that enables data analysts and engineers to quickly load and transform data in SQL databases.

    Language:Python57202