datawarehouse

There are 444 repositories under the datawarehouse topic.

DataVault
14
Tilda
Language: Java
13
Sales-Data-Warehouse
Language: HTML
7
fabricks
Language: Python
5
system-big-data-movies-fr
Language: Jupyter Notebook
5
cortana-intelligence-customer360
This repository contains instructions and code to deploy a customer 360 profile solution on Azure stack using the Cortana Intelligence Suite.
Language: Python
24
Datawarehouse
Fully dockerized Data Warehouse (DWH) using Airflow, dbt, and PostgreSQL, with a dashboard built on Redash
Language: Jupyter Notebook
23
RStoolKit
RStoolKit - A utility to perform a complete health check of your AWS Redshift cluster
Language: PLSQL
23
ETL-Project
The goal of this project is to illustrate Extract Transform Load (ETL) using Python and SQL. ETL is a process commonly done in computing, which takes raw data, cleans it and stores it for later use. The extraction phase targets and retrieves the data. Transform manipulates and cleans the data. Then load stores the data, typically in a data warehouse.
Language: Jupyter Notebook
21
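The three phases described for ETL-Project can be illustrated with a minimal sketch. This is not code from that repository: the sample records, the `sales` table, and the function names are all assumptions made for illustration, and an in-memory SQLite database stands in for a real data warehouse.

```python
import sqlite3

# Hypothetical raw records; a real pipeline would read these from a file or an API.
RAW_ROWS = [
    {"name": "  alice ", "amount": "10.50"},
    {"name": "bob", "amount": "not-a-number"},
    {"name": "Carol", "amount": "7"},
]


def extract():
    """Extract: retrieve the raw data (here, an in-memory list)."""
    return list(RAW_ROWS)


def transform(rows):
    """Transform: tidy the names, parse the amounts, and drop unparseable rows."""
    cleaned = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # skip rows whose amount cannot be parsed
        cleaned.append((row["name"].strip().title(), amount))
    return cleaned


def load(rows, conn):
    """Load: store the cleaned rows in a SQL table (assumed name: sales)."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (name TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales (name, amount) VALUES (?, ?)", rows)
    conn.commit()


def run_etl(conn):
    """Run the three phases in order and return the stored rows."""
    load(transform(extract()), conn)
    return conn.execute("SELECT name, amount FROM sales ORDER BY name").fetchall()
```

Calling `run_etl(sqlite3.connect(":memory:"))` runs the whole flow against a throwaway database; the row with the unparseable amount is dropped during the transform phase.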
intelli-swift-core
Distributed, Column-oriented storage, Realtime analysis, High performance Database
Language: Java
18
IMDB-DB-Dump-Projects
Taking IMDb's database dumps and turning them into multiple projects
Language: TSQL
18
DDO
A dbt package to perform DataOps & administrative CI/CD on your data warehouse.
16
data-brewery
Data Brewery is an ETL (Extract-Transform-Load) program that connects to many data sources (cloud services, databases, ...) and manages data warehouse workflows.
Language: Scala
16
cobra-policytool
Manage Apache Atlas and Ranger configuration for your Hadoop environment.
Language: Python
16
nifi-postgres-metabase
Template for creating batch-based ETL workflows for data warehouses
Language: PLpgSQL
15
SparkETL
Implement a complete data warehouse ETL using Spark SQL
Language: Java
14
Sentiment-analysis-from-MLOps-paradigm
This project provides an automated end-to-end ML pipeline that trains a biLSTM network for sentiment analysis, with experiment tracking, benchmarking through model testing and evaluation, and model transition to production followed by deployment to a cloud instance via CI/CD
Language: Python
13
AmazonMoviesDataWarehouse
Data warehouse: stores and analyzes Amazon movie data over the years
Language: Java
13
data_ai_for_all
Data Analysis, Analytics, Science, AI & ML, LLM etc.
Language: Jupyter Notebook
13
hephaestus
Hephaestus - ETL and ML tools for OHDSI - OMOP CDM
Language: Python
13
MUST_HAVE_SKILLS
This repo consists of all important concepts for data engineers.
Language: Java
11
Data-Modeling-with-Postgres
A project to design a fact and dimension star schema for optimizing queries on a flight booking database using PostgreSQL, a relational database management system. This schema is well-suited for a flight booking database, as it allows for efficient querying of data such as booking dates, flight routes, and passenger information.
Language: PLpgSQL
10
Data-Warehouse-UKAccident
Information system for business project - building and mining a data warehouse
Language: TSQL
9
hexbase
Open-source ETL pipeline for HEX cryptocurrency data
Language: Python
9
vau
Data Vault data model and ETL generator for Oracle Databases
Language: Java
9
Data-Warehouse-With-Redshift
Data Warehouse with AWS Redshift and Visualizing data using Power BI
Language: Jupyter Notebook
8
DataManager
Better organize data in a data lake and build ETL pipelines with a web UI tool.
Language: JavaScript
8
GitHub-FabricDWDBProject
Template to perform CI/CD for Microsoft Fabric Data Warehouses using GitHub Actions
Language: TSQL
7
Modern-Big-Data-Analysis-using-SQL
RDBMS techniques for Big Data analysis
7
DateAndTimeDimensionBuilders
Data warehousing date dimension and time dimension builders written in Python.
Language: Python
7
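For readers unfamiliar with the idea behind DateAndTimeDimensionBuilders, a date dimension is a table with one row per calendar day and precomputed attributes to join and group on. The sketch below is an illustration only, not that repository's code; the function name and the column names (`date_key`, `quarter`, `is_weekend`, and so on) are assumptions.

```python
from datetime import date, timedelta


def build_date_dimension(start, end):
    """Return one row (dict) per calendar day from start to end, inclusive."""
    rows = []
    day = start
    while day <= end:
        rows.append({
            "date_key": int(day.strftime("%Y%m%d")),  # e.g. 20240229
            "date": day,
            "year": day.year,
            "quarter": (day.month - 1) // 3 + 1,
            "month": day.month,
            "day": day.day,
            "iso_weekday": day.isoweekday(),           # Monday=1 ... Sunday=7
            "is_weekend": day.isoweekday() >= 6,
        })
        day += timedelta(days=1)
    return rows
```

An integer key in `YYYYMMDD` form is a common choice because it sorts chronologically and stays readable in query results, though other key schemes work equally well.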
fabric-accelerator
Accelerator to build a Microsoft Fabric modern data platform using ELT Framework https://github.com/bennyaustin/elt-framework
Language: Python
6
Data-Engineering-Project
The Centralized Data Warehouse and ML Solution for Banking Analytics is a project that combines a centralized repository for banking data with machine learning algorithms to enable predictive analysis.
Language: Jupyter Notebook
6
CourseShop_DataWarehouse
a data warehouse for an online course shop
Language: TSQL
6
data-warehouse
Practice learning data warehousing
Language: TSQL
6
Business-Intelligence-on-Big-Data-_-U-TAD-2017-Big-Data-Master-Final-Project
This is the final project I had to do to finish my Big Data Expert Program in U-TAD in September 2017. It uses the following technologies: Apache Spark v2.2.0, Python v2.7.3, Jupyter Notebook (PySpark), HDFS, Hive, Cloudera Impala, Cloudera HUE and Tableau.
Language: Jupyter Notebook
6
ETL-Data-Pipeline-using-AirFlow
An ETL data pipeline project that uses Airflow DAGs to extract employee data from PostgreSQL schemas, load it into an AWS data lake, transform it with a Python script, and finally load it into a Snowflake data warehouse using SCD type 2.
Language: Python
5
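The SCD type 2 technique mentioned in the last description keeps history by closing the current row and appending a new version whenever a tracked attribute changes. The following is a minimal in-memory sketch of that idea, not the repository's implementation; the row layout (`id`, `city`, `valid_from`, `valid_to`, `is_current`) and the function name are assumptions.

```python
from datetime import date


def apply_scd2(dimension, incoming, today):
    """Apply SCD type 2 changes to `dimension` in place and return it.

    dimension: list of dicts with keys id, city, valid_from, valid_to, is_current
    incoming:  dict mapping id -> the latest city reported by the source system
    today:     the effective date for any change detected in this run
    """
    current = {row["id"]: row for row in dimension if row["is_current"]}
    for emp_id, city in incoming.items():
        row = current.get(emp_id)
        if row is not None and row["city"] == city:
            continue  # attribute unchanged: keep the current version as is
        if row is not None:
            # Attribute changed: close the old version instead of overwriting it.
            row["valid_to"] = today
            row["is_current"] = False
        # New key or changed attribute: append a fresh current version.
        dimension.append({
            "id": emp_id,
            "city": city,
            "valid_from": today,
            "valid_to": None,
            "is_current": True,
        })
    return dimension
```

In a warehouse the same logic is usually expressed as an update that closes changed rows followed by an insert of the new versions; the sketch only shows the row-versioning rule itself.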