data-lakehouse
There are 17 repositories under the data-lakehouse topic.
Qbeast-io/qbeast-spark
Qbeast-spark: DataSource enabling multi-dimensional indexing and efficient data sampling. Big Data, free from the unnecessary!
pracdata/awesome-open-source-data-engineering
A curated list of open source tools used in analytical stacks and the data engineering ecosystem
dominikhei/Local-Data-LakeHouse
Sample Data Lakehouse deployed in Docker containers using Apache Iceberg, Minio, Trino and a Hive Metastore. Can be used for local testing.
aabouzaid/modern-data-platform-poc
My M.Sc. dissertation: Modern Data Platform using DataOps, Kubernetes, and Cloud-Native ecosystem to build a resilient Big Data platform based on Data Lakehouse architecture which is the base for Machine Learning (MLOps) and Artificial Intelligence (AIOps).
gupta-aayushkr/F1-Racing
The project aims to process Formula 1 racing data, create an automated data pipeline, and make the data available for presentation and analysis purposes.
mahmoudparsian/data-warehousing
This repository is a place for the Data Warehousing course at the Information Systems & Analytics department, Santa Clara University.
firelink-data/evolution
🦖 Efficiently evolve your old fixed-length data files into more modern file formats, fully parallelized!
prneidhardt/AWS-Data-Lakehouse
STEDI project
sudohainguyen/mini-lakehouse
Data lakehouse at home with docker compose
ananyacanakapalli/University-Data-Design
This project is aimed at overhauling a university's data infrastructure to improve efficiency, security, and scalability, resulting in the successful creation of a unified data management solution.
Data-Kube/tst-datalakehouse-hudi
#Test - Create a Data Lakehouse in Kubernetes
eavilaes/qbeast-spark
Qbeast-spark: DataSource enabling multi-dimensional indexing and efficient data sampling. Big Data, free from the unnecessary!
huwngnosleep/complete_lakehouse_techstack
This project implements an end-to-end techstack for a data platform, for local development.
k0rsakov/all_about_DuckDB
Everything you need to know about DuckDB
k0rsakov/infrastructure_for_data_engineer_S3
Infrastructure for a data engineer: S3
THeades/serverless-data-lakehouse
This is an example project showing how to build a serverless data lakehouse on AWS using Terraform, Apache Iceberg and Spark.