This is an exploratory data analysis of statistics derived from Metacritic, with an emphasis on genres and individual critic scoring. The data set was gathered with scraper.py, a scraper of my own design, which stores its output as raw.csv. That CSV is then formatted and cleaned in sanitizer.py and exported as an optimized parquet file. The data set (currently incomplete) was last updated on 2/23/2023. The final data set has the form:

| | general | | | | | genres | | | critics | | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| | title | platform | release_date | metascore | userscore | Action | ... | Fantasy | Gameshark | ... | Eurogamer |
| 0 | str | cat | datetime | int8 | float | bool | ... | bool | int8 | ... | int8 |

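
The layout above, with two-level columns and compact dtypes, can be sketched in pandas roughly as follows. This is a minimal illustration and not the actual contents of sanitizer.py: the `sanitize` function, the sample rows, the two-genre and two-critic column lists, and the `float32` width are all assumptions made for the example. It also assumes no missing values, since `int8` cannot hold NaN.

```python
import pandas as pd

# Hypothetical column groups; the real data set has many more genres and critics.
GENERAL = ["title", "platform", "release_date", "metascore", "userscore"]
GENRES = ["Action", "Fantasy"]
CRITICS = ["Gameshark", "Eurogamer"]

# Illustrative rows only, not taken from the scraped data.
raw = pd.DataFrame(
    {
        "title": ["Example Game A", "Example Game B"],
        "platform": ["PC", "Switch"],
        "release_date": ["2020-01-15", "2021-06-30"],
        "metascore": [85, 72],
        "userscore": [8.1, 6.4],
        "Action": [1, 0],
        "Fantasy": [0, 1],
        "Gameshark": [90, 70],
        "Eurogamer": [80, 75],
    }
)


def sanitize(df: pd.DataFrame) -> pd.DataFrame:
    """Cast columns to compact dtypes and group them under two-level headers."""
    dtypes = {"title": "string", "platform": "category",
              "metascore": "int8", "userscore": "float32"}  # float width is assumed
    dtypes.update({g: "bool" for g in GENRES})    # one flag per genre
    dtypes.update({c: "int8" for c in CRITICS})   # one score per critic
    out = df.astype(dtypes)
    out["release_date"] = pd.to_datetime(out["release_date"])

    # Map every flat column name to its top-level group.
    group_of = {c: "general" for c in GENERAL}
    group_of.update({c: "genres" for c in GENRES})
    group_of.update({c: "critics" for c in CRITICS})
    out.columns = pd.MultiIndex.from_tuples([(group_of[c], c) for c in out.columns])
    return out


clean = sanitize(raw)
print(clean.dtypes)
```

With this shape, `clean["critics"]` selects every critic column at once and `clean[("general", "metascore")]` selects a single column. Exporting would then be a `DataFrame.to_parquet` call, which needs a parquet engine such as pyarrow installed.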
All required modules for scraping, cleaning, and analyzing can be installed with `pip install -r requirements.txt`.
The report based on these findings is stored in the analysis_results folder, along with any figures generated from analysis.ipynb and other miscellaneous resources used.