etl-framework
There are 232 repositories under the etl-framework topic.
pathwaycom/pathway
Python ETL framework for stream processing, real-time analytics, LLM pipelines, and RAG.
elastic/logstash
Logstash - transport and process your logs, events, or other data
cloudquery/cloudquery
The open source ELT framework powered by Apache Arrow
noflo/noflo
Flow-based programming for JavaScript
apache/hamilton
Apache Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage/tracing and metadata. Runs and scales everywhere Python does.
san089/goodreads_etl_pipeline
An end-to-end GoodReads data pipeline for building a data lake, data warehouse, and analytics platform.
singer-io/getting-started
This repository is a getting started guide to Singer.
marsupialtail/quokka
Making data lakes work for time series
stitchfix/hamilton
A scalable general purpose micro-framework for defining dataflows. THIS REPOSITORY HAS BEEN MOVED TO www.github.com/dagworks-inc/hamilton
Cinchoo/ChoETL
ETL framework for .NET (Parser / Writer for CSV, Flat, Xml, JSON, Key-Value, Parquet, Yaml, Avro formatted files)
apache/seatunnel-web
SeaTunnel is a distributed, high-performance data integration platform for the synchronization and transformation of massive data (offline & real-time).
YotpoLtd/metorikku
A simplified, lightweight ETL Framework based on Apache Spark
seanharr11/etlalchemy
Extract, Transform, Load: Any SQL Database in 4 lines of Code.
usc-isi-i2/kgtk
Knowledge Graph Toolkit
quintoandar/butterfree
A tool for building feature stores.
data-dot-all/dataall
A modern data marketplace that makes collaboration among diverse users (like business, analysts and engineers) easier, increasing efficiency and agility in data projects on AWS.
Nextdoor/bender
Bender - Serverless ETL Framework
velocitybolt/open-extract
Structured Data Extractor for AI Agents. Search your documents or the web for specific data and get it back in JSON or Markdown in a single tool call.
ceumicrodata/mETL
mito ETL tool
dalenewman/Transformalize
Configurable Extract, Transform, and Load
BitwiseInc/Hydrograph
A visual ETL development and debugging tool for big data
halestudio/hale
(Spatial) data harmonisation with hale»studio (formerly HUMBOLDT Alignment Editor)
duoan/OpenKettleWebUI
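A Kettle-based web platform for scheduling and controlling data processing. It supports file repositories and database repositories, controls Kettle data transformations through the web platform, and can be integrated into existing systems as middleware.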
frankframework/frankframework
The Frank!Framework is an easy-to-use, stateless integration framework which allows (transactional) messages to be modified and exchanged between different systems.
globalbioticinteractions/globalbioticinteractions
Global Biotic Interactions provides access to existing species interaction datasets
patterns-app/patterns-devkit
Data pipelines from re-usable components
usc-isi-i2/dig-etl-engine
Download DIG to run on your laptop or server.
restarone/violet_rails
An app engine for your business. Seamlessly implement business logic with a powerful API. Out-of-the-box CMS, blog, forum, and email functionality. Developer-friendly and easily extendable for your next SaaS/XaaS project. Built with Rails 6, Devise, Sidekiq, and PostgreSQL.
ContextData/VectorETL
Build super simple end-to-end data & ETL pipelines for your vector databases and Generative AI applications
geopython/stetl
Stetl, Streaming ETL, is a lightweight geospatial processing and ETL framework written in Python.
socialpoint-labs/sqlbucket
Lightweight library to write, orchestrate and test your SQL ETL. Writing ETL with data integrity in mind.
harry-s-grewal/mls-real-estate-scraper-for-realtor.ca
Python MLS and Real-Estate Data Scraper for the Realtor.ca Website
ylem-co/ylem
Ylem is an open-source platform for real-time data streaming orchestration
maxim2266/csvplus
csvplus extends the standard Go encoding/csv package with fluent interface, lazy stream operations, indices and joins.
MassStreetAnalytics/etl-framework
A framework for moving data into a data warehouse.
vim89/datapipelines-essentials-python
Simplified ETL process in Hadoop using Apache Spark. Includes a complete ETL pipeline for a data lake, with SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations.