sequential-skip-prediction

Spotify Sequential Skip Prediction

RESEARCH SCENARIO

For a music listener, it is intriguing to see how well Spotify understands the user’s music taste. Two key features of Spotify are:

“Shuffle song”: although shuffling is a random operation, it is critical to a good user experience. It should be random enough, yet confined to the user's taste.
“Skip button”: This article explains the relationship between the skip button and user experience. A skip tells us something about the user’s music taste, and the skip rate depends on many factors, such as the user's age, the current hour of day, and whether the day is a weekend or a weekday.

So, the question is whether we can predict, from the user's historical play data, whether the user will skip a song shortly after it starts playing (an indication of non-interest) when a playlist of songs is played sequentially. An answer would give us an intuition about the user's music preference and help to design a “Shuffle Song” feature that is more focused on user experience.
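To make the task concrete, here is a minimal sketch of a naive baseline, not the method used in this repository: each track is predicted from the tracks that came before it in the same session. The function names, the 0.5 threshold, and the sample session are assumptions made for illustration.

```python
from typing import List


def predict_next_skip(history: List[bool], threshold: float = 0.5) -> bool:
    """Predict a skip when the session's skip rate so far reaches the threshold.

    `threshold` is a hypothetical tuning knob, not a value from the dataset.
    """
    if not history:
        return False  # no evidence yet: assume the track will be played
    skip_rate = sum(history) / len(history)
    return skip_rate >= threshold


def sequential_predictions(skips: List[bool]) -> List[bool]:
    """Walk a session in order, predicting each track from the tracks before it."""
    return [predict_next_skip(skips[:i]) for i in range(len(skips))]


# Hypothetical session: True means the track was skipped.
session = [True, True, False, True]
predictions = sequential_predictions(session)
accuracy = sum(p == s for p, s in zip(predictions, session)) / len(session)
```

A real model would also use the track metadata and the interaction features described below; this baseline only shows the sequential framing of the problem.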

DATA SET

The dataset is divided into sessions, and every session contains a sequence of tracks whose length is in the range 0-20:

Session ID
Track ID

Every track has metadata describing its musical properties, such as acousticness and beat strength:

Duration - duration of the song
Release_year - release year
Us_popularity_estimate - US popularity
Acousticness
Beat_strength
Bounciness
Danceability
Dyn_range_mean
Energy
Flatness
Instrumentalness
Key
Liveness
Loudness
Mechanism
Mode
Organism
Speechiness
Tempo
Time_signature
Valence
Acoustic_vector_0
Acoustic_vector_1
Acoustic_vector_2
Acoustic_vector_3
Acoustic_vector_4
Acoustic_vector_5
Acoustic_vector_6
Acoustic_vector_7
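Since sessions refer to tracks by Track ID, the track metadata has to be attached to each session row before it can be used as model input. The sketch below shows one way to do that with plain dictionaries. The IDs, the feature values, and the helper name `attach_track_features` are all made up for illustration, and only three of the metadata fields are shown.

```python
from typing import Dict, List

# Hypothetical track metadata, keyed by Track ID (values are invented).
tracks: Dict[str, dict] = {
    "t_001": {"duration": 180.0, "release_year": 2018, "acousticness": 0.12},
    "t_002": {"duration": 236.5, "release_year": 2011, "acousticness": 0.75},
}

# Hypothetical session rows: one row per track played in a session.
session_rows: List[dict] = [
    {"session_id": "s_01", "track_id": "t_002"},
    {"session_id": "s_01", "track_id": "t_001"},
]


def attach_track_features(rows: List[dict], track_index: Dict[str, dict]) -> List[dict]:
    """Return new rows that combine each session row with its track's metadata."""
    return [{**row, **track_index[row["track_id"]]} for row in rows]


enriched = attach_track_features(session_rows, tracks)
```

The original rows are left untouched, and the order of tracks within the session is preserved, which matters for a sequential task.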

In addition, the dataset contains details about user interaction and the playlist:

Session_position - Position in sequence in a single session
Session_length - Total session length
Skip_1 - Boolean indicating if the track was only played very briefly
Skip_2 - Boolean indicating if the track was only played briefly
Skip_3 - Boolean indicating if most of the track was played
Not_skipped - Boolean indicating that the track was played in its entirety
Context_switch - Boolean indicating if the user changed context between the previous row and the current row. This could for example occur if the user switched from one playlist to another.
No_pause_before_play - Boolean indicating if there was no pause between playback of the previous track and this track
Short_pause_before_play - Boolean indicating if there was a short pause between playback of the previous track and this track
Long_pause_before_play - Boolean indicating if there was a long pause between playback of the previous track and this track
Hist_user_behavior_n_seekfwd - Number of times the user did a seek forward within track
Hist_user_behavior_n_seekback - Number of times the user did a seek back within track
Hist_user_behavior_is_shuffle - Boolean indicating if the user encountered this track while shuffle mode was activated
Hour_of_day - {0-23} - The hour of day.
Date - The date.
Premium - Boolean indicating if the user was on premium or not. This has potential implications for skipping behavior.
Context_type - what type of context the playback occurred within
Hist_user_behavior_reason_start - the user action which led to the current track being played.
Hist_user_behavior_reason_end - the user action which led to the current track playback ending.
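As noted in the research scenario, the skip rate may depend on whether the day is a weekend or a weekday. The following sketch derives that flag from the Date field and compares skip rates between the two groups. It assumes dates are ISO formatted (YYYY-MM-DD) and uses Skip_2 as the skip label; both are assumptions, and the rows are invented sample data.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, List

# Hypothetical interaction rows (invented values).
rows: List[dict] = [
    {"date": "2018-07-14", "hour_of_day": 22, "skip_2": True},   # Saturday
    {"date": "2018-07-14", "hour_of_day": 23, "skip_2": False},  # Saturday
    {"date": "2018-07-16", "hour_of_day": 9, "skip_2": True},    # Monday
    {"date": "2018-07-16", "hour_of_day": 9, "skip_2": True},    # Monday
]


def is_weekend(iso_date: str) -> bool:
    """Saturday and Sunday have weekday() values 5 and 6."""
    return date.fromisoformat(iso_date).weekday() >= 5


def skip_rate_by_weekend(interactions: List[dict]) -> Dict[bool, float]:
    """Map is_weekend (True/False) to the fraction of rows with skip_2 set."""
    counts = defaultdict(lambda: [0, 0])  # key -> [skips, total]
    for row in interactions:
        key = is_weekend(row["date"])
        counts[key][0] += int(row["skip_2"])
        counts[key][1] += 1
    return {key: skips / total for key, (skips, total) in counts.items()}


rates = skip_rate_by_weekend(rows)
```

The same grouping pattern could be applied to Hour_of_day or Premium to check how strongly each one relates to skipping.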

Thank you :)