furqanshahid85-python/Corona-Dataset-Analysis-Using-PySpark
This module performs statistical analysis on the novel coronavirus dataset. The dataset used was last updated on May 02, 2020. The module performs the following functions:

* Displays the statistics of the input dataset
* Reads data from CSV files and stores the aggregated output in Parquet format
* Counts the number of records for each country/region and province/state
* Lists max cases for each country/region and province/state
* Lists max deaths for each country/region and province/state
* Lists max recoveries for each country/region and province/state
Python