datawarehouse
There are 444 repositories under the datawarehouse topic.
cortana-intelligence-customer360
This repository contains instructions and code to deploy a customer 360 profile solution on Azure stack using the Cortana Intelligence Suite.
Datawarehouse
Fully dockerized Data Warehouse (DWH) using Airflow, dbt, and PostgreSQL, with a dashboard built in Redash
RStoolKit
RStoolKit - A utility to perform a complete health check of your Amazon Redshift cluster
ETL-Project
The goal of this project is to illustrate Extract, Transform, Load (ETL) using Python and SQL. ETL is a common computing process that takes raw data, cleans it, and stores it for later use. The extract phase targets and retrieves the data, transform manipulates and cleans it, and load stores it, typically in a data warehouse.
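The three phases described above can be sketched in a few lines of Python. This is a minimal illustration, not code from the project: sqlite3 stands in for the source and warehouse databases, and the table names (`raw_sales`, `clean_sales`) are invented for the example.

```python
import sqlite3

def extract(conn):
    # Extract: retrieve the raw rows from the source table
    return conn.execute("SELECT id, amount FROM raw_sales").fetchall()

def transform(rows):
    # Transform: drop rows with missing amounts and normalize to float
    return [(rid, float(amt)) for rid, amt in rows if amt is not None]

def load(conn, rows):
    # Load: store the cleaned rows in the warehouse table
    conn.executemany("INSERT INTO clean_sales VALUES (?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_sales (id INTEGER, amount TEXT)")
conn.execute("CREATE TABLE clean_sales (id INTEGER, amount REAL)")
conn.executemany("INSERT INTO raw_sales VALUES (?, ?)",
                 [(1, "10.5"), (2, None), (3, "7")])
load(conn, transform(extract(conn)))
print(conn.execute("SELECT COUNT(*) FROM clean_sales").fetchone()[0])  # 2
```

Real pipelines differ mainly in scale and tooling (batching, scheduling, schema validation), but they follow this same extract → transform → load shape.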
intelli-swift-core
Distributed, column-oriented storage; real-time analysis; high-performance database
IMDB-DB-Dump-Projects
Taking IMDb's database dumps and turning them into multiple projects
DDO
A dbt package to perform DataOps and administrative CI/CD on your data warehouse.
data-brewery
Data Brewery is an ETL (Extract-Transform-Load) program that connects to many data sources (cloud services, databases, ...) and manages data warehouse workflows.
cobra-policytool
Manage Apache Atlas and Ranger configuration for your Hadoop environment.
nifi-postgres-metabase
Template for creating batch-based ETL workflows for data warehouses
SparkETL
Implement a complete data warehouse ETL using Spark SQL
Sentiment-analysis-from-MLOps-paradigm
This project provides an automated end-to-end ML pipeline that trains a biLSTM network for sentiment analysis, with experiment tracking, benchmarking through model testing and evaluation, and transition of the model to production followed by deployment to a cloud instance via CI/CD.
AmazonMoviesDataWarehouse
Data warehouse for storing and analyzing Amazon movie data over the years
data_ai_for_all
Data Analysis, Analytics, Science, AI & ML, LLM etc.
hephaestus
:stars: Hephaestus - ETL and ML tools for OHDSI - OMOP CDM
MUST_HAVE_SKILLS
This repo covers the most important concepts for data engineers.
Data-Modeling-with-Postgres
A project to design a fact-and-dimension star schema for optimizing queries on a flight booking database using PostgreSQL, a relational database management system. A star schema suits a flight booking database well, allowing efficient querying of data such as booking dates, flight routes, and passenger information.
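A star schema like the one described above centers a fact table on foreign keys into surrounding dimension tables. The sketch below is illustrative only (table and column names are invented, and sqlite3 stands in for PostgreSQL):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: descriptive attributes, one row per entity
CREATE TABLE dim_date  (date_key INTEGER PRIMARY KEY, full_date TEXT);
CREATE TABLE dim_route (route_key INTEGER PRIMARY KEY, origin TEXT, destination TEXT);

-- Fact table: one row per booking, keyed to the dimensions
CREATE TABLE fact_booking (
    booking_id INTEGER PRIMARY KEY,
    date_key   INTEGER REFERENCES dim_date(date_key),
    route_key  INTEGER REFERENCES dim_route(route_key),
    fare       REAL
);

INSERT INTO dim_date  VALUES (20240101, '2024-01-01');
INSERT INTO dim_route VALUES (1, 'JFK', 'LAX');
INSERT INTO fact_booking VALUES (100, 20240101, 1, 349.0);
""")

# Typical star-schema query: join the fact table to a dimension and aggregate
row = conn.execute("""
    SELECT r.origin, r.destination, SUM(f.fare)
    FROM fact_booking f
    JOIN dim_route r ON r.route_key = f.route_key
    GROUP BY r.origin, r.destination
""").fetchone()
print(row)  # ('JFK', 'LAX', 349.0)
```

The payoff is that analytical queries touch one wide fact table plus small dimension lookups, which keeps joins shallow and predictable.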
Data-Warehouse-UKAccident
Information system for business project - building and mining data warehouse
hexbase
open-source ETL pipeline for HEX cryptocurrency data
vau
Data Vault data model and ETL generator for Oracle Databases
Data-Warehouse-With-Redshift
Data Warehouse with AWS Redshift and Visualizing data using Power BI
DataManager
Better organize data in a data lake and build ETL pipelines with a Web UI tool.
GitHub-FabricDWDBProject
Template to perform CI/CD for Microsoft Fabric Data Warehouses using GitHub Actions
Modern-Big-Data-Analysis-using-SQL
RDBMS techniques for Big Data analysis
DateAndTimeDimensionBuilders
Data warehousing date dimension and time dimension builders written in Python.
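A date dimension builder of the kind described above typically emits one row per calendar day, pre-computing attributes that queries would otherwise derive repeatedly. This is a generic sketch, not code from that repository; the column names are illustrative:

```python
import datetime

def build_date_dimension(start, end):
    """Return one row per day from start to end inclusive."""
    rows, day = [], start
    while day <= end:
        rows.append({
            "date_key": int(day.strftime("%Y%m%d")),  # surrogate key, e.g. 20240101
            "iso_date": day.isoformat(),
            "year": day.year,
            "month": day.month,
            "weekday": day.strftime("%A"),
        })
        day += datetime.timedelta(days=1)
    return rows

dim = build_date_dimension(datetime.date(2024, 1, 1), datetime.date(2024, 1, 3))
print(len(dim), dim[0]["date_key"], dim[2]["weekday"])  # 3 20240101 Wednesday
```

Because the table is small and immutable, it is usually generated once for a multi-decade range and joined against every date-keyed fact table.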
fabric-accelerator
Accelerator to build a Microsoft Fabric modern data platform using ELT Framework https://github.com/bennyaustin/elt-framework
Data-Engineering-Project
The Centralized Data Warehouse and ML Solution for Banking Analytics is a project that combines a centralized repository for banking data with machine learning algorithms to enable predictive analysis.
CourseShop_DataWarehouse
A data warehouse for an online course shop
data-warehouse
Practice project for learning data warehousing
Business-Intelligence-on-Big-Data-_-U-TAD-2017-Big-Data-Master-Final-Project
This is the final project I had to do to finish my Big Data Expert Program in U-TAD in September 2017. It uses the following technologies: Apache Spark v2.2.0, Python v2.7.3, Jupyter Notebook (PySpark), HDFS, Hive, Cloudera Impala, Cloudera HUE and Tableau.
ETL-Data-Pipeline-using-AirFlow
An ETL data pipeline project that uses Airflow DAGs to extract employees' data from PostgreSQL schemas, load it into an AWS data lake, transform it with a Python script, and finally load it into a Snowflake data warehouse using SCD Type 2.