
There are 70 repositories under the etl-process topic.

  • imsanjoykb/Data-Science-Regular-Bootcamp

    Regular practice in Data Science, Machine Learning, and Deep Learning: solving ML project problems and analytical issues, and regularly boosting my knowledge. The goal is to help learners with learning resources in the Data Science field.

    Language: Jupyter Notebook
  • taogeYT/pyetl

    Python ETL framework

  • AndrejaCH/Movies-ETL

    For this project I am creating an ETL (Extract, Transform, and Load) pipeline using Python, RegEx, and SQL Database. The goal is to retrieve data from different sources, clean and transform it into a useful format and finally load the data into an SQL database where the data is ready for further analysis. The result is an established automated pipeline and a clean data set stored in an SQL database.

    Language: Jupyter Notebook
  • Wazzabeee/pyspark-etl-twitter

    Implementation of an ETL process for real-time sentiment analysis of tweets with Docker, Apache Kafka, Spark Streaming, MongoDB and Delta Lake

  • polakowo/yelp-3nf

    3NF-normalize Yelp data on S3 with Spark and load it into Redshift - automate the whole thing with Apache Airflow

    Language: Jupyter Notebook
  • source-watcher-core

    This is a PHP project which combines ETL with different strategies to extract data from multiple databases, files, and services, transform it and load it into multiple destinations.

  • Steve0verton/google-maps-geocode-enrichment

    This project repository provides a headless module to enrich location data in a database table using the Google Maps Geocode API.

  • thompson0012/PyEmits

    Sugar candy for data scientists. Easy manipulation for time-series data analytics work.

  • GhazaleZe/CourseShop_DataWarehouse

    A data warehouse for an online course shop

  • AleksaMCode/university-notices-email-notifier

    Dynamic website scraper and email notifier.

  • yessinek97/Satisfaction-Analysis-Solution-For-Phone-Service-Providers

    This is a sentiment analysis project that aims to provide better insight into customers' satisfaction, based on comments gathered (scraped) from social media and classified with Google's BERT model.

    Language: Jupyter Notebook
  • polarbeargo/udacity-nd027-Data-Modeling-with-Postgres

    Udacity nd027 Data Modeling with Postgres

    Language: Jupyter Notebook
  • aymane-maghouti/HR-Data-Pipeline-Azure

    This project is a comprehensive data engineering solution that extracts HR data from a GitHub repository, performs data transformations using Azure services, and creates an interactive HR dashboard using Power BI. The goal is to enable HR professionals and decision-makers to gain insights from the HR data for better workforce management.

    Language: Jupyter Notebook
  • davideaimar/eth2dgraph

    Extractor of Ethereum data to Dgraph format, utilities to analyse the indexed data.

  • emsalcengiz/data-normalize-with-etl-procesess

    I performed various data normalization operations with Python scripts. The target data is in CSV format.

  • hmignon/P2_BooksToScrape

    Scraping BooksToScrape (P2 OC D-A Python): Use Python basics for market analysis

  • The-Music-has-Changed-Extract-transform-load-

    We examine two data sets related to the music industry. We extract, transform, and load the data sets in order to create a database and identify insights and trends about the music industry.

    Language: Jupyter Notebook
  • caesarmario/data-warehouse-credit-card-applicant-using-pentaho

    This repository contains the OLTP, ETL process (using Pentaho Data Integration), and OLAP of a credit card dataset. The dataset is taken from Kaggle and is part of the author's Capstone Project.

  • nickjlupu/Movies-ETL

    An ETL process for a fictitious streaming service, Amazing Prime, was developed in Jupyter Notebook. The code was then refactored into a Python script to automate the ETL process.

    Language: Jupyter Notebook
  • V-MalM/ETL

    A Case Study of Extract, Transform, Load. Documentation includes sources of data, types of data wrangling performed (data cleaning, joining, filtering, and aggregating) and the schemata used in the final production database. Technologies used include Pandas, PostgreSQL, Jupyter Notebook.

    Language: Jupyter Notebook
  • bkfan1/spotify-user-analysis

    ETL process and EDA of user top artists & tracks data in Spotify using Spotipy, Pandas, Airflow and Seaborn

    Language: Jupyter Notebook
  • DCF0708/Amazon_Vine_Analysis

    ETL and analysis of trends in product review data from Amazon Vine.

    Language: Jupyter Notebook
  • keity-p/Processo_de_ETL-_Projeto_Pix

    ELT process for an Exploratory Data Analysis of Pix data.

    Language: Jupyter Notebook
  • The-Music-has-Changed-WEBSIDE

    We examine two data sets related to the music industry. We want to extract, transform, and load them in order to identify insights and trends about the music industry.

  • SAZZAD-AMT/Informatica-Data-Integration-and-Transformation-Project

    This process illustrates how to structure and manipulate relational databases effectively, demonstrating key SQL operations and transformations within an Informatica environment. The provided images and detailed SQL commands serve as a comprehensive guide for implementing and understanding these database management tasks.

  • ScuderiRosario/CryptoMundo

    CryptoMundo is a simple and easy tool to analyze cryptocurrency data in real time which provides a simple and informative dashboard.

    Language: Jupyter Notebook
  • seyedmahdiamin1998/ETL_catawiki

    ETL : Extract --> transform --> load

  • benkeyben/candidalysis

    Candidalysis is a project aimed at analyzing student performance for academic year 2022 to 2023 using Power BI. The primary goal of this project is to extract, visualize, and interpret various key performance indicators (KPIs) related to exams conducted during this period.

  • bhammy27/Fantasy_Football_database_SQL

    A desire to win my Fantasy Football leagues led to a realization that I have a passion for Data Analytics. I will create my own database using PostgreSQL and pgAdmin.

  • izouazou/AirQuality-ETL

    Air Quality ETL is a Python repository facilitating the extraction, transformation, and loading of air quality data from RapidAPI to a Pandas DataFrame for easy analysis and customization.

  • JanviSK/Amazon-Global-Sales-Dashboard

    This project performs data analysis on Amazon Global Sales using Power BI. It implements data preprocessing, data cleaning, Power Query and data visualization.

  • simonediluna/Laboratory-of-Data-Science

    This repository showcases my university "Laboratory of Data Science" project. It encompasses the implementation of a data warehouse, ETL process, Data Cube, MDX queries, and an interactive dashboard.

  • Tarun-Gemini/Netflix_Shows_Pbix

    Netflix user insights through data visualization with Power BI

  • TeniOT/HR-Employee-Report

    Dataset cleaned, queried, and visualised for an HR employee data report. Skills: Power BI, MySQL, EDA, ETL

  • pacicap/Data-Warehousing

    Extraction of data from different database sources, transformation (unification and cleaning) of the extracted data, and loading into the data warehouse

  • pzaino/microETL

    A simple, reusable, template-based ETL (Extract, Transform and Load) library and framework written in Python
