Utilizing ML algorithms coupled with data visualization on the Spotify Dataset (kNN, Regression, FCN).
The goal of this project is to analyse the Spotify data with practical analytical techniques that serve two audiences: producers and consumers. The prediction model gives artists an indicator of how popular a song they produce is likely to be, while the music recommendation engine benefits consumers by recommending similar songs.
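To make the recommendation idea concrete, here is a minimal sketch of a nearest-neighbour lookup over audio features. It is an illustration only, not the code from the notebooks: the track names, the feature values, the choice of three features, and the `recommend` helper are all hypothetical, and it uses only the Python standard library.

```python
import math

# Hypothetical toy catalogue: (track name, [danceability, energy, valence]).
# Names and values are made up for illustration, not taken from data.csv.
TRACKS = [
    ("Track A", [0.80, 0.70, 0.90]),
    ("Track B", [0.78, 0.72, 0.85]),
    ("Track C", [0.20, 0.30, 0.10]),
    ("Track D", [0.25, 0.35, 0.15]),
    ("Track E", [0.50, 0.90, 0.40]),
]


def recommend(query_name, tracks, k=2):
    """Return the names of the k tracks closest to query_name (Euclidean distance)."""
    features = dict(tracks)
    query = features[query_name]
    # Distance from the query track to every other track in the catalogue.
    distances = [
        (math.dist(query, vec), name)
        for name, vec in tracks
        if name != query_name
    ]
    distances.sort()  # smallest distance first
    return [name for _, name in distances[:k]]


print(recommend("Track A", TRACKS, k=2))  # → ['Track B', 'Track E']
```

The toy features all lie in the range 0 to 1, so no scaling is applied here. With features on very different scales (tempo or loudness, for example), they would normally be standardized first so that no single feature dominates the distance.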
The dataset was obtained from Kaggle: https://www.kaggle.com/yamaerenay/spotify-dataset-19212020-160k-tracks
This dataset includes the following files in the directory /data:
a) data.csv : Main data file containing data about 160k+ tracks
b) data_by_artists.csv : Aforementioned data.csv grouped by artists
c) data_by_genre.csv : Aforementioned data.csv grouped by genre
d) data_by_year.csv : Aforementioned data.csv grouped by year
e) data_w_genres.csv : Aforementioned data.csv with the genres associated with each artist
f) Spotify_Analyzed_Presentation.pdf : A PowerPoint presentation summarizing our efforts.
g) Spotify_Analyzed_Project_Report.pdf : A detailed description of our proceedings and analysis of the dataset.
h) code/Spotify_Data_Analysis.ipynb : Python notebook containing code.
i) code/genre_classification_NN_approach.ipynb : Experimentation on genre classification using Neural Network
j) requirements.txt : A list of Python packages that are required to run our project.
This project consists of Python notebooks. The code base can be accessed from the directory /code. You may load these notebooks in Jupyter or Google Colab.
- Install the libraries from requirements.txt
- To run in Jupyter Notebook: open the notebook and run the cells from cell 2 onwards (skip the google.colab cell)
- To run in Google Colab: open the notebook and run all cells
Aditya Khopkar
Grusha Mehrotra
Sukoon Sarin