extract-transform-load

There are 91 repositories under the extract-transform-load topic.

  • etl-project-anteraja-reviews

    This repository was created for the final group project of a Data Engineering course.

    Language:Jupyter Notebook
  • DE_ETL_HTML_CSV_JSON

    This notebook scrapes information about the largest banks by market capitalization from a wiki page, and stores the information both as a CSV and as a JSON file.

    Language:Jupyter Notebook
  • Movies-ETL

    Created an automated pipeline that takes in new data from a movie set. Performed the appropriate transformations, and loaded the data into existing tables. Performed the ETL process by adding the data to a PostgreSQL database.

    Language:Jupyter Notebook
  • Udacity_Capstone-ETL_Pipeline

    Udacity Data Engineering Capstone project

    Language:Python
  • Car-ETL

    This project utilized four sources of data to analyze information about characteristics of automobiles and the car buying process. This database could be useful in the car buying and selling process for both dealerships and private consumers.

    Language:Jupyter Notebook
  • ncov-db

    Stores SARS-CoV-2 genomic analysis results from ncov2019-artic-nf and ncov-tools in a SQLite DB

    Language:Python
  • Movies-ETL

    The Movies Extract-Transform-Load (ETL) Analysis repo contains movie data extracted from Wikipedia and Kaggle in CSV and JSON file formats. The datasets were cleaned and merged in the transform step, and the cleaned datasets were loaded into a movie_data SQL database. Regular expressions, which match strings of characters defined by search patterns, played a critical role in cleaning the box office, budget, release date, and running time data. Lambda functions were used in the transform phase as anonymous functions.

    Language:Jupyter Notebook
  • Extract_transform_Load

    The purpose of this project is to extract, transform & load datasets into a database in pgAdmin while providing step-by-step instructions for users to follow. We decided to observe active COVID-19 cases across the world in relation to continued vaccination efforts running from January 1, 2021 to March 21, 2021. We have successfully extracted, transformed, & loaded this data utilizing CSV files, Python in Jupyter Notebook and a SQL database.

    Language:Jupyter Notebook
  • Movies-ETL

    Fictional company Amazing Prime needs to create an automated pipeline that takes in new data, performs the appropriate transformations, and loads the data into existing tables. Code will be refactored to create one function that takes in the three files and performs the ETL process by adding the data to a PostgreSQL database.

    Language:Jupyter Notebook
  • Data-Preparation

    Among the first steps of data analysis, data preparation plays an important role in producing a clean, error-free, clearly formatted dataset on which to train and test the model.

    Language:Python
  • Movies-ETL

    Uses the Extract, Transform, Load (ETL) process on several movie datasets to create data pipelines and predict popular films.

    Language:Jupyter Notebook
  • dcpam

    Data Construct-Populate-Access-Manage - Open source data warehouse solution.

    Language:C
  • Crime-Anyltics

    Approximately 10 people are shot on an average day in Chicago. This project focuses on Poverty and Crime in Chicago Neighborhoods. Full-Stack Project.

    Language:Jupyter Notebook
  • Citi-Bike-Analytics

    An analysis of Citi Bike with Tableau from January 2018 to September 2019

  • ETL-Project

    E (Extract), T (Transform), L (Load) Project that showcases both SQL and NoSQL databases.

    Language:Jupyter Notebook
  • etlrun

    Extract-Transform-Load tool based on a message-passing, self-reprocessing XML pipeline

    Language:Perl
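
Several of the descriptions above mention the same three steps: reading CSV data, cleaning fields with regular expressions, and loading the result into a SQL database. The following is a minimal sketch of that pattern, not code taken from any of the listed repositories. The sample data, the `movies` table, and every function name are hypothetical, and SQLite stands in for whichever database a given project actually uses.

```python
import csv
import io
import re
import sqlite3

# Hypothetical sample input; none of the listed repositories ships this data.
RAW_CSV = """title,budget,runtime
Film A,"$1,500,000",95 min
Film B,$2.5 million,120 min
Film C,unknown,88 min
"""


def extract(text):
    """Extract: parse CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def parse_dollars(value):
    """Convert '$1,500,000' or '$2.5 million' to a float; return None otherwise."""
    match = re.fullmatch(
        r"\$([\d,]+(?:\.\d+)?)(\s*million)?", value.strip(), flags=re.IGNORECASE
    )
    if match is None:
        return None
    amount = float(match.group(1).replace(",", ""))
    return amount * 1_000_000 if match.group(2) else amount


def transform(rows):
    """Transform: clean the budget and runtime fields with regular expressions."""
    cleaned = []
    for row in rows:
        runtime = re.match(r"\d+", row["runtime"])
        minutes = int(runtime.group()) if runtime else None
        cleaned.append((row["title"], parse_dollars(row["budget"]), minutes))
    return cleaned


def load(records, conn):
    """Load: write the cleaned records into a (hypothetical) movies table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS movies "
        "(title TEXT, budget REAL, runtime_min INTEGER)"
    )
    conn.executemany("INSERT INTO movies VALUES (?, ?, ?)", records)
    conn.commit()


def run_pipeline(text, conn):
    """Run extract, transform, and load in order, then read the table back."""
    load(transform(extract(text)), conn)
    query = "SELECT title, budget, runtime_min FROM movies ORDER BY title"
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    for record in run_pipeline(RAW_CSV, connection):
        print(record)
```

Keeping the three stages as separate functions means each one can be tested or swapped on its own, for example by replacing `load` with a PostgreSQL writer while leaving the cleaning logic untouched.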