data-warehouse
There are 535 repositories under the data-warehouse topic.
PostHog/posthog
🦔 PostHog provides open-source web & product analytics, session recording, feature flagging and A/B testing that you can self-host. Get started - free.
oxnr/awesome-bigdata
A curated list of awesome big data frameworks, resources and other awesomeness.
rudderlabs/rudder-server
Privacy and Security focused Segment-alternative, in Golang and React
hydradatabase/hydra
Hydra: Column-oriented Postgres. Add scalable analytics to your project in minutes.
dlt-hub/dlt
data load tool (dlt) is an open source Python library that makes data loading easy 🛠️
BlankerL/DXY-COVID-19-Data
COVID-19/2019-nCoV Infection Time Series Data Warehouse
elementary-data/elementary
The dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.
Multiwoven/multiwoven
🔥🔥🔥 Open Source Alternative to Hightouch, Census, and RudderStack - Reverse ETL & Data Activation
san089/Udacity-Data-Engineering-Projects
A few projects related to Data Engineering, including Data Modeling, Infrastructure setup on cloud, Data Warehousing and Data Lake development.
DataBrewery/cubes
[NOT MAINTAINED] Light-weight Python OLAP framework for multi-dimensional data analysis
tensorbase/tensorbase
TensorBase is a new big data warehouse built with a modern approach.
cloudera/hue
Open source SQL Query Assistant service for Databases/Warehouses
BemiHQ/BemiDB
Postgres read replica optimized for analytics
GoogleCloudPlatform/bigquery-utils
Useful scripts, udfs, views, and other utilities for migration and data warehouse operations in BigQuery.
scratchdata/scratchdata
Scratch is a swiss army knife for big data.
alanchn31/Data-Engineering-Projects
Personal Data Engineering Projects
apache/cloudberry
An advanced and mature open-source MPP (Massively Parallel Processing) database. An open-source alternative to Greenplum Database.
raystack/optimus
Optimus is an easy-to-use, reliable, and performant workflow orchestrator for data transformation, data modeling, pipelines, and data quality management.
Canner/vulcan-sql
Data API Framework for AI Agents and Data Apps
unytics/bigfunctions
Supercharge BigQuery with BigFunctions
domainmod/domainmod
DomainMOD is an open source application written in PHP & MySQL used to manage your domains and other internet assets in a central location. DomainMOD also includes a Data Warehouse framework that allows you to import your web server data so that you can view, export, and report on your live data.
Titan-Systems/titan
Titan Core - Snowflake infrastructure-as-code. Provision environments, automate deploys, CI/CD. Manage RBAC, users, roles, and data access. Declarative Python Resource API. Change Management tool for the Snowflake data warehouse.
vmware/versatile-data-kit
One framework to develop, deploy and operate data workflows with Python and SQL.
intermine/intermine
A powerful open source data warehouse system
ubisoft/mobydq
🐳 Tool to automate data quality checks on data pipelines
GokuMohandas/data-engineering
Construct a modern data stack and orchestrate the workflows to create high-quality data for analytics and ML applications.
tuva-health/tuva
Main repo including core data model, data marts, reference data, terminology, and the clinical concept library
data-engineering-community/data-engineering-project-template
This is a template you can use for your next data engineering portfolio project.
pracdata/awesome-open-source-data-engineering
A curated list of open source tools used in analytics platforms and the data engineering ecosystem
dalenewman/Transformalize
Configurable Extract, Transform, and Load
google/space
Unified storage framework for the entire machine learning lifecycle
unytics/airbyte_serverless
Airbyte made simple (no UI, no database, no cluster)
Canner/wren-engine
🤖 The semantic engine for LLMs, bringing semantic context to AI agents. 🔥
pixelsdb/pixels
An efficient storage and compute engine for both on-prem and cloud-native data analytics.
iam-mhaseeb/Skytrax-Data-Warehouse
A full data warehouse infrastructure with ETL pipelines running inside Docker, using Apache Airflow for data orchestration, AWS Redshift as the cloud data warehouse, and Metabase for data visualizations such as analytical dashboards.
Rello/analytics
Analytics - Open source data warehouse and reporting for Nextcloud