Course Project for CS 6242: built a Tableau UI for song recommendation based on collaborative filtering, with a Python backend and a TabPy interface; used the listening session histories of ~1,000 users and ~100k songs to create a user-song interaction matrix.
State-of-the-art song recommendation systems such as Spotify, Apple Music, and Amazon Music are widely available, but they fall short in two key areas: 1) they fail to provide any explainability for why a particular song was recommended, and 2) they do not give users tools to control the recommendations. Through our product, we aim to address these challenges.
We used the Last.fm dataset. This dataset contains (user, timestamp, artist, song) tuples collected from the Last.fm API using the user.getRecentTracks() method. It represents the complete listening histories (up to May 5, 2009) of nearly 1,000 users. Each record has the following tab-separated format:
userid \t timestamp \t musicbrainz-artist-id \t artist-name \t musicbrainz-track-id \t track-name
| Element | Statistic |
|---|---|
| Total Lines | 19,150,868 |
| Unique Users | 992 |
| Artists with MBID | 107,528 |
| Artists without MBID | 69,420 |
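As a minimal illustration, the raw dump can be loaded into pandas as follows. The file name is the one shipped with the Last.fm-1K distribution and the column names mirror the record format above; both are assumptions for this sketch.

```python
import csv
import pandas as pd

# Column names follow the record format above (tab-separated, no header row).
COLS = ["userid", "timestamp", "musicbrainz-artist-id", "artist-name",
        "musicbrainz-track-id", "track-name"]

plays = pd.read_csv(
    "userid-timestamp-artid-artname-traid-traname.tsv",  # Last.fm-1K file name (assumed)
    sep="\t", header=None, names=COLS,
    quoting=csv.QUOTE_NONE,          # track names may contain stray quote characters
    parse_dates=["timestamp"],
    on_bad_lines="skip",             # skip the handful of malformed lines in the dump
)

print(len(plays))                    # ~19.1M rows
print(plays["userid"].nunique())     # ~992 unique users
```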
This data contains time-based user-song interaction information, which we use to build a user-song interaction rating matrix. A rating is a score in the range 1-5 that indicates how much a user likes a song. We derive this rating from a monthly song frequency and an inverse song frequency: the monthly song frequency (TF) is the number of times a user has listened to a song in a given month, and the inverse song frequency (IDF) is a function of the number of users who have listened to that song in that particular month. The rating for month $m$ is then

$$r_m(u, s) \propto \mathrm{TF}_m(u, s) \cdot \mathrm{IDF}_m(s),$$

where $\mathrm{TF}_m(u, s)$ is the number of times user $u$ listened to song $s$ in month $m$, and $\mathrm{IDF}_m(s)$ decreases with the number of users who listened to $s$ in month $m$; the result is scaled to the 1-5 range.

This is done so that niche user tastes are captured in our recommendations: songs listened to by relatively few users receive higher weight. Further, to capture temporal relationships, we give relatively higher weight to recent songs, i.e. the most recent month is weighted 1, the second most recent month 23/24, and so on.
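As a concrete sketch, the rating matrix described above can be computed roughly as follows, starting from the `plays` DataFrame loaded earlier. The log form of IDF, the 24-month linear decay, and the per-user rescaling to 1-5 are assumptions made for illustration; the report only fixes the general recipe.

```python
import numpy as np
import pandas as pd

def build_rating_matrix(plays: pd.DataFrame) -> pd.DataFrame:
    """Monthly TF-IDF rating with temporal decay (illustrative sketch)."""
    df = plays.copy()
    df["month"] = df["timestamp"].dt.to_period("M")

    # TF: listens per (user, song, month)
    tf = (df.groupby(["userid", "track-name", "month"])
            .size().rename("tf").reset_index())

    # IDF: penalise songs heard by many users in a month (log form is an assumption)
    n_users = df.groupby("month")["userid"].nunique().rename("n_users")
    listeners = (df.groupby(["track-name", "month"])["userid"]
                   .nunique().rename("n_listeners"))
    tf = tf.join(n_users, on="month").join(listeners, on=["track-name", "month"])
    tf["tfidf"] = tf["tf"] * np.log(tf["n_users"] / tf["n_listeners"])

    # Temporal decay: most recent month weighted 1, next 23/24, ... (floored at 0)
    month_idx = tf["month"].dt.year * 12 + tf["month"].dt.month
    age = month_idx.max() - month_idx
    tf["score"] = tf["tfidf"] * np.clip((24 - age) / 24, 0, None)

    # Aggregate over months, then rescale each user's scores to 1-5 (assumption)
    agg = tf.groupby(["userid", "track-name"])["score"].sum().reset_index()
    lo = agg.groupby("userid")["score"].transform("min")
    hi = agg.groupby("userid")["score"].transform("max")
    agg["rating"] = 1 + 4 * (agg["score"] - lo) / (hi - lo).replace(0, 1)
    return agg.pivot(index="userid", columns="track-name", values="rating")
```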
Further, we extracted the following datasets from Spotify's Web API: 1) the new user's song data, 2) MLHD song attributes, and 3) the songs and attributes of the "Top 500 songs of 2022" and "Top 500 songs of All Time" playlists.
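The report does not name a particular client library, so the sketch below uses spotipy (an assumption) to pull a user's playlist tracks and their audio features from the Web API; playlist pagination and error handling are omitted for brevity.

```python
import spotipy
from spotipy.oauth2 import SpotifyClientCredentials

# Credentials are assumed to be provided via the SPOTIPY_CLIENT_ID /
# SPOTIPY_CLIENT_SECRET environment variables.
sp = spotipy.Spotify(auth_manager=SpotifyClientCredentials())

def user_songs(spotify_user_id: str):
    """Pull tracks from the user's public playlists, with their audio features."""
    tracks = []
    playlists = sp.user_playlists(spotify_user_id)
    for pl in playlists["items"]:
        page = sp.playlist_items(pl["id"], additional_types=("track",))
        tracks += [it["track"] for it in page["items"] if it["track"]]

    ids = [t["id"] for t in tracks if t["id"]]
    features = []
    for i in range(0, len(ids), 100):        # the endpoint accepts up to 100 ids per call
        features += sp.audio_features(ids[i:i + 100])
    return tracks, features
```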
In this technique, we can base our algorithm on either user-user or song-song similarity. We identify a set of nearest neighbours for a given user i (found via the Pearson correlation coefficient) based on their ratings of common songs. We then take the weighted average of the ratings these neighbouring users gave to a song j to produce a predicted rating r(i, j) for user i on song j. To account for user bias, we compute this weighted average on the deviations of ratings from each user's mean.
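A minimal user-based version of this prediction, written against a dense user x song ratings matrix with NaNs for missing entries (an assumption for illustration; the actual implementation details are not shown here), could look like:

```python
import numpy as np

def predict_rating(R: np.ndarray, i: int, j: int, k: int = 30) -> float:
    """Predict user i's rating of song j from the k most similar users
    (Pearson correlation on co-rated songs), using mean-centred ratings
    to correct for user bias. R is a user x song matrix with np.nan gaps."""
    rated = ~np.isnan(R)
    mean_i = np.nanmean(R[i])

    sims, devs = [], []
    for u in range(R.shape[0]):
        if u == i or not rated[u, j]:
            continue
        common = rated[i] & rated[u]
        if common.sum() < 2:
            continue
        # Pearson correlation over songs rated by both users
        sim = np.corrcoef(R[i, common], R[u, common])[0, 1]
        if np.isnan(sim):
            continue
        sims.append(sim)
        devs.append(R[u, j] - np.nanmean(R[u]))   # deviation from u's own mean

    if not sims:
        return mean_i
    order = np.argsort(np.abs(sims))[::-1][:k]     # keep the k strongest neighbours
    sims, devs = np.array(sims)[order], np.array(devs)[order]
    return mean_i + np.dot(sims, devs) / (np.abs(sims).sum() + 1e-9)
```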
The process takes a user's Spotify user-id as input and produces the data output used to build the Tableau dashboard. It can be divided into the following steps:
- Taking the user's Spotify user-id, pulling all their playlists, and extracting all songs.
- Mapping the user's genre affinity.
- Identifying genre-representative songs.
- Creating a user-song matrix for the new user, which is used as input to the collaborative filtering algorithm.
- Generating a personalised playlist using the recommendation engine.
- Generating recommendations from the "Top 500 songs of 2022" and "Top 500 songs of All Time" playlists by mapping the ranked output onto those playlists using cosine similarity. Since a user has intrinsic preferences for certain types of songs, we exploit this assumption to map recommended songs to out-of-corpus songs using content-based similarity (a sketch of this mapping follows the list).
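A rough sketch of that last mapping step, assuming each song is represented by a vector of Spotify audio attributes (the exact feature set used is not specified in the report):

```python
import numpy as np

# Illustrative feature set; the report only states that audio attributes are used.
FEATURES = ["danceability", "energy", "instrumentalness", "liveness", "valence", "tempo"]

def map_to_playlist(recommended: np.ndarray, playlist: np.ndarray, top_n: int = 10):
    """Map in-corpus recommendations to the closest 'Top 500' playlist songs by
    cosine similarity of their audio-feature vectors (rows = songs)."""
    def normalize(X):
        return X / (np.linalg.norm(X, axis=1, keepdims=True) + 1e-9)
    sims = normalize(recommended) @ normalize(playlist).T   # (n_rec, n_playlist)
    best = sims.max(axis=0)                 # best match score for each playlist song
    return np.argsort(best)[::-1][:top_n]   # indices of playlist songs to surface
```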
The data pipeline produces three output files that are used as input to the Tableau dashboard.
The final product is a Tableau dashboard that consumes the output of the data workflow. It shows the personalised recommended playlist and a graphical representation of the songs' music attributes. Further, it provides tools to control the recommendations and to dynamically update the playlist according to the user's mood.
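These dynamic updates go through the TabPy interface mentioned earlier. A minimal sketch of deploying a Python endpoint that Tableau can call is shown below; the endpoint name and function are placeholders for illustration, not the project's actual backend code.

```python
from tabpy.tabpy_tools.client import Client

client = Client("http://localhost:9004/")   # default TabPy address

def recommend(user_ids, moods):
    """Placeholder: look up or recompute recommendations for each user.
    Tableau passes columns as lists and expects a list of the same length back."""
    return [f"playlist_for_{u}_{m}" for u, m in zip(user_ids, moods)]

# Deploy so Tableau can call the function from a calculated field via SCRIPT_STR(...)
client.deploy("recommend", recommend,
              "Return a recommended playlist id per user", override=True)
```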
The user interface has the following sections:
- Input field to take the user's Spotify id.
- Playlist panel showing the recommended songs along with their ranks.
- Graphical representation of the recommended playlist's song attributes.
- Mood Control bar to adjust the recommendations according to the user's mood.
- "How recommendation works?" button that describes and visualises how the recommendations are generated.
The mood control section has the following toggles:
- Choose Your Music: Selecting recommendation type from 3 options
- Feeling Nostalgic
- Best of 2022
- Best of All Times
- Danceability, Energy, Instrumentalness, Liveness: sliders over the corresponding Spotify audio attributes.
- Song in Minutes: length of a song in minutes (a filtering sketch is shown after this list).
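One plausible way these controls could re-filter the recommended playlist is sketched below; the column names and the simple range-filter rule are assumptions, since the report does not spell out the dashboard's update logic.

```python
import pandas as pd

def apply_mood_controls(playlist: pd.DataFrame, controls: dict) -> pd.DataFrame:
    """Keep only recommended songs whose attributes fall inside the slider ranges.
    `controls` maps an attribute column to a (low, high) range chosen in the UI."""
    mask = pd.Series(True, index=playlist.index)
    for attr, (low, high) in controls.items():
        mask &= playlist[attr].between(low, high)
    return playlist[mask].sort_values("rank")   # 'rank' column assumed from the pipeline

# Example: keep energetic, danceable songs under 4 minutes
# filtered = apply_mood_controls(playlist, {"energy": (0.6, 1.0),
#                                           "danceability": (0.5, 1.0),
#                                           "duration_min": (0.0, 4.0)})
```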
The "How recommendation works?" button showcases the explainability aspect of the product. It visually explains the concept of collaborative filtering and visualizes user-neighbour interactions.
Summarizing the above sections, we introduced the following innovations through our product:
- Defining a rating matrix using users' song listening histories.
- Explainability of the recommendations, i.e. explaining the attributes of the recommended playlist.
- Interactive mood controls giving users control over their recommendations.
- A toggle to recommend songs from the "Top 500 songs of 2022" and "Top 500 songs of All Time" playlists, i.e. out-of-corpus recommendations.
- Functionality to recommend songs to new users, i.e. Spotify users not part of the Last.fm dataset.
- Visualizing the working of the recommendation algorithm using user-neighbour interactions.
This metric measures how many of the recommended results are relevant and appear near the top. Precision@K is the fraction of relevant items among the top K recommended results, and the mean average precision@K (MAP@K) is the average precision@K averaged over all queries in the dataset:

$$\mathrm{AP@K} = \frac{1}{\min(K, R)} \sum_{k=1}^{K} P(k)\,\mathrm{rel}(k), \qquad \mathrm{MAP@K} = \frac{1}{|Q|} \sum_{q \in Q} \mathrm{AP@K}(q),$$

where $P(k)$ is the precision over the first $k$ recommendations, $\mathrm{rel}(k) = 1$ if the $k$-th song is relevant and $0$ otherwise, $R$ is the number of relevant songs for the query, and $Q$ is the set of queries (users).
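A direct implementation of this metric, using the common min(K, R) normalisation (the report does not spell out which normalisation was used), is:

```python
import numpy as np

def average_precision_at_k(recommended, relevant, k=10):
    """AP@K for one user: `recommended` is a ranked list of song ids,
    `relevant` is the set of songs the user actually liked (held-out test set)."""
    score, hits = 0.0, 0
    for rank, song in enumerate(recommended[:k], start=1):
        if song in relevant:
            hits += 1
            score += hits / rank          # precision@rank, counted at relevant positions
    return score / min(k, len(relevant)) if relevant else 0.0

def map_at_k(all_recommended, all_relevant, k=10):
    """MAP@K: mean of AP@K over all users/queries."""
    return float(np.mean([average_precision_at_k(r, s, k)
                          for r, s in zip(all_recommended, all_relevant)]))
```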
- Normalized Discounted Cumulative Gain: gain refers to the relevance score of each recommended item (a song, in this case). Cumulative gain at K is the sum of the gains of the first K recommended items. Discounted cumulative gain (DCG) weighs each relevance score by its position, giving higher weight to recommendations at the top; the normalized version (NDCG) divides DCG by the DCG of an ideal ranking, yielding a score between 0 and 1.
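A short sketch using the usual log2 position discount (the gain values and discount base used in our evaluation are not stated here, so treat this as illustrative):

```python
import numpy as np

def ndcg_at_k(gains, k=10):
    """NDCG@K from the relevance gains of the recommended songs, in rank order.
    DCG discounts the gain at rank r by log2(r + 1); the ideal DCG here re-sorts
    the same gains in decreasing order (a common simplification)."""
    gains = np.asarray(gains, dtype=float)[:k]
    discounts = np.log2(np.arange(2, gains.size + 2))        # log2(2), log2(3), ...
    dcg = np.sum(gains / discounts)
    ideal = np.sum(np.sort(gains)[::-1] / discounts)
    return dcg / ideal if ideal > 0 else 0.0
```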
We followed two different approaches to split our dataset into an 80%-20% train-test split. The first approach takes a stratified sampling route: each user's interactions are split randomly 80-20, so every user appears in both sets. The second approach randomly shuffles all interactions and takes 80% for training and 20% for testing.
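Both strategies can be expressed with scikit-learn's train_test_split; the sketch below assumes one row per (userid, song, rating) interaction and that each user has enough interactions to split.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

def split_interactions(interactions: pd.DataFrame, stratified=True,
                       test_size=0.2, seed=42):
    """Per-user (stratified) 80-20 split, or a single global random 80-20 shuffle."""
    if stratified:
        parts = [train_test_split(g, test_size=test_size, random_state=seed)
                 for _, g in interactions.groupby("userid")]
        train = pd.concat([p[0] for p in parts])
        test = pd.concat([p[1] for p in parts])
        return train, test
    return train_test_split(interactions, test_size=test_size, random_state=seed)
```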
We conducted a user survey to seek feedback on our product. We received 17 responses, and the overall response was positive. One promising observation was a strong positive response to the "Explainability Visualization" from respondents who did not care about explainability in the first place.