
Group Project for UNCC Data Analytics Bootcamp


Team-Epsilon Extract Transform Load Project

Aaron Sell, Nolan Simmons, Ange Ndjeka, Oluwaseun Orepekan

This project demonstrates the ETL (Extract, Transform, Load) process using a Kaggle dataset of Valve's Steam games info. The process includes extracting data from a zipped CSV file, transforming the data in a Jupyter Notebook, and loading the processed data into a MongoDB database. The dataset is available at https://www.kaggle.com/datasets/fronkongames/steam-games-dataset?resource=download
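The extract and load steps described above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the code in our notebook. The function names `extract_rows` and `load_rows` are hypothetical, and `load_rows` assumes only that the target object exposes an `insert_many` method, as a pymongo collection does.

```python
import csv
import io
import zipfile


def extract_rows(zip_source, member):
    """Read one CSV member of a zip archive into a list of dicts.

    zip_source may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(zip_source) as archive:
        with archive.open(member) as raw:
            # Archive members are opened as bytes; wrap them so csv sees text.
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            return list(csv.DictReader(text))


def load_rows(collection, rows):
    """Insert rows into any object exposing insert_many; return the count."""
    documents = [dict(row) for row in rows]
    if documents:  # pymongo's insert_many rejects an empty list
        collection.insert_many(documents)
    return len(documents)
```

With pymongo installed and a MongoDB server running, a collection such as `MongoClient()["steam"]["games"]` could be passed to `load_rows`. The database and collection names there are placeholders, not the ones used in this project.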

Directories

We have two primary directories: jupyterNotebook and resources.

Jupyter Notebook

The jupyterNotebook directory contains test code as well as the primary notebook, SteamGamesData.ipynb, which performs all of the ETL steps.

Resources

The resources directory contains a zip file and a sub-folder. The zip file, games_info_clean.csv.zip, contains the raw data. The sub-folder, cleanData, contains the data we cleaned, in a file called rated_games.csv.
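As an illustration of how a cleaned file like cleanData/rated_games.csv might be produced, here is a small sketch of one plausible cleaning step: keeping only rows that have a rating. The column name `rating` and the helper names `keep_rated` and `write_csv` are assumptions made for this example; the actual cleaning steps in SteamGamesData.ipynb may differ.

```python
import csv
from pathlib import Path


def keep_rated(rows, column="rating"):
    """Keep rows whose `column` holds a non-empty, non-zero value.

    The default column name is a placeholder, not the dataset's real header.
    """
    return [
        row for row in rows
        if (row.get(column) or "").strip() not in ("", "0")
    ]


def write_csv(rows, path):
    """Write a list of dicts to `path`, creating parent folders as needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    # Take the header from the first row; an empty input yields an empty header.
    fieldnames = list(rows[0]) if rows else []
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return path
```

Rows read from the raw CSV would pass through `keep_rated` and then be written with `write_csv` to a path such as resources/cleanData/rated_games.csv.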

Conclusions

We had fun with the project. Since it covered only ETL, we did not look for insights in the data, but we were able to do some cleaning. Hopefully this readme is clear and concise.