
Final Project for UW DATA 512: Human-Centered Data Science (Fall 2019)


Music Through The Ages: Billboard Top 100 Analysis

Abstract

I analyzed the Billboard Hot 100 weekly charts from 1958 to 2019 to examine three relationships: popularity vs. relevance, the velocity of a song's climb to #1 vs. the length of its #1 streak, and Billboard peak position vs. views of the corresponding YouTube music video. I show that songs from the 2010s are not outperforming songs from the past, and I confirm that YouTube views correlate with higher Billboard peak positions, though I note that a confounding factor is involved.

Refer to analysis.ipynb for the comprehensive analysis, code, and discussion. Please note that I am using NBViewer in order to render the interactive visualizations I built using Plotly.

This project was completed as part of the final project for DATA 512 (Human-Centered Data Science), University of Washington, Fall 2019.

Reproducibility

This work is intended to be fully reproducible. Anyone should be able to run my code and reproduce the exact results presented here. To try it out for yourself, please clone this repository:

git clone https://github.com/kfrankc/data512-final-project

The code repository has the following dependencies:

After installing the dependencies, run the command jupyter notebook, which should bring you to a localhost environment. From there, open the analysis/ folder and click on analysis.ipynb, the Jupyter notebook that contains both my analysis and code. You can ignore the tmp_data folder; it holds the temporary .csv files I generate throughout the notebook.

Feel free to contact me at kfrankc [at] uw edu if you have any questions about this analysis.

License for the Data

Both the Billboard and YouTube datasets are shared under the CC0 license. Links to the dataset websites can be found below. I have also added the two datasets to the raw_data folder; they are named hot_100.csv and yt_us_videos.csv respectively.
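Before running the notebook, it can help to confirm that both raw data files are present and to peek at their columns. The following is a minimal sketch, not part of the notebook itself: the file names come from this README, while the helper names (describe_csv, check_raw_data) and the assumption that raw_data sits in the directory you run the script from are hypothetical. It uses only the Python standard library, so it makes no assumptions about the column names in either file.

```python
import csv
from pathlib import Path

# File names are taken from this README; the location of raw_data relative
# to the current working directory is an assumption.
RAW_DATA_FILES = ("hot_100.csv", "yt_us_videos.csv")


def describe_csv(path):
    """Return (column_names, data_row_count) for one CSV file."""
    # errors="replace" keeps the check from failing on stray non-UTF-8 bytes.
    with open(path, newline="", encoding="utf-8", errors="replace") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])           # first row is treated as the header
        row_count = sum(1 for _ in reader)  # remaining rows are data rows
    return header, row_count


def check_raw_data(raw_dir="raw_data"):
    """Describe each expected file, or map it to None if it is missing."""
    summary = {}
    for name in RAW_DATA_FILES:
        path = Path(raw_dir) / name
        summary[name] = describe_csv(path) if path.is_file() else None
    return summary


if __name__ == "__main__":
    for name, info in check_raw_data().items():
        if info is None:
            print(f"{name}: missing")
        else:
            columns, rows = info
            print(f"{name}: {rows} rows, columns = {columns}")
```

Run it from the directory that contains raw_data; a file reported as missing usually means the script was started from a different directory.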

Relevant Links

Pandas Documentation

Plotly Documentation

Billboard Data

YouTube Data

Course Wiki

Repository License