/song-recommendation

Song recommendation using the Spotify dataset

Primary Language: Jupyter Notebook

Song Recommendation using cosine similarity score

This project is one of the assessments for the Machine Learning Foundation course in my B.Tech degree.

In this project I have used a Spotify song dataset to generate recommendations.

The dataset can be obtained from my Kaggle profile.

To run the Python notebook yourself, provide your own Spotify API credentials in a text file. Also change the paths of all the files that the notebook accesses.

Generating recommendations takes approximately 3 minutes of notebook run time per playlist.

Note: The Spotify playlist provided by the user must be public.

Approach

Content-based recommendation engines work on the data provided by the user; in this case, that data is the user's playlist.
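As a rough illustration of the idea, the sketch below ranks a small catalog against a playlist using cosine similarity. It is not the notebook's code: the function names (`cosine_scores`, `recommend`), the toy feature columns, and the choice to summarise the playlist as the mean of its song vectors are all assumptions made for this example.

```python
import numpy as np

def cosine_scores(profile, candidates):
    """Cosine similarity between one profile vector and each candidate row."""
    profile = np.asarray(profile, dtype=float)
    candidates = np.asarray(candidates, dtype=float)
    norms = np.linalg.norm(candidates, axis=1) * np.linalg.norm(profile)
    # Guard against zero vectors to avoid division by zero.
    norms = np.where(norms == 0, 1.0, norms)
    return candidates @ profile / norms

def recommend(playlist_features, catalog_features, top_n=2):
    # Assumption: the playlist is summarised as the mean of its song vectors.
    profile = np.mean(playlist_features, axis=0)
    scores = cosine_scores(profile, catalog_features)
    # Indices of the highest-scoring catalog songs, best first.
    return np.argsort(scores)[::-1][:top_n], scores

# Toy data: columns are hypothetical [valence, energy, acousticness] values.
playlist = np.array([[0.9, 0.8, 0.1], [0.8, 0.9, 0.2]])
catalog = np.array([
    [0.85, 0.85, 0.15],  # close to the playlist's average
    [0.10, 0.20, 0.90],  # acoustic, low energy
    [0.70, 0.60, 0.30],  # somewhat similar
])

top, scores = recommend(playlist, catalog)
print(top, scores.round(3))
```

With this toy data the first and third catalog rows rank highest, because they point in nearly the same direction as the playlist's average vector.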

I have used various fields that describe the audio of each song, such as valence, acousticness, liveness, energy, and loudness, along with attributes such as genres and popularity.

To make genres usable by a machine learning model, I used a TF-IDF vectorizer to convert them from list-like objects into a document-term matrix. Categorical features such as popularity and year are one-hot encoded using the pd.get_dummies function.

Why not Euclidean distance as a similarity score metric?

Euclidean distance ignores the direction of the vectors: it measures only how far apart their endpoints are, so it is sensitive to magnitude. Cosine similarity instead scores two vectors by the angle between them, so vectors pointing in the same direction are treated as similar regardless of their lengths.

[Figure: Euclidean distance vs cosine similarity graph]
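The difference between the two metrics can be seen in a small numeric example. The vectors below are invented for illustration and are not taken from the dataset.

```python
import numpy as np

def euclidean_distance(a, b):
    """Straight-line distance between the endpoints of a and b."""
    return float(np.linalg.norm(a - b))

def cosine_similarity(a, b):
    """Cosine of the angle between a and b (1.0 means same direction)."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

a = np.array([1.0, 2.0])
b = np.array([10.0, 20.0])  # same direction as a, ten times longer
c = np.array([2.0, 1.0])    # similar length to a, different direction

print("euclidean a-b:", euclidean_distance(a, b))
print("euclidean a-c:", euclidean_distance(a, c))
print("cosine a-b:", cosine_similarity(a, b))
print("cosine a-c:", cosine_similarity(a, c))
```

Euclidean distance rates `c` as closer to `a` than `b` is, even though `b` has exactly the same proportions as `a`. Cosine similarity ranks `b` first because the angle between `a` and `b` is zero.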