/exploratory-data-analysis-on-amazon-review-data-using-mongodb-and-pyspark

This repository showcases the outcomes of an Exploratory Data Analysis (EDA), including visualisation, conducted on the comprehensive Amazon Review Data (2018) dataset, consisting of nearly 233.1 million records and occupying approximately 128 gigabytes (GB) of data storage, using MongoDB and PySpark.

Primary LanguageJupyter NotebookBSD 3-Clause "New" or "Revised" LicenseBSD-3-Clause

Watchers