Primary language: Jupyter Notebook. License: MIT.

Find Similar Users Based on Music Taste

This is coursework for the Unsupervised Machine Learning and Data Mining course at Northeastern University, completed by me and Chamanthi Aki.

Introduction

In the audio streaming and music services industry, platforms like Spotify and Apple Music have already built industry-leading recommendation systems that precisely suggest songs we might like. However, two problems remain:

  • First, they focus only on the listening experience and ignore the social needs and community values of people drowning in an ocean of music.
  • Second, suggestions generated by algorithms cannot cover our wide-ranging music tastes. People tend to discover new music mostly through friends’ recommendations.

We want to bridge this gap and explore the power of music to bring people together, build relationships, and expand their music communities. Stronger social connections should in turn strengthen the tie between users and the music streaming platform, which could reduce the churn rate and increase profits.

In this project, we recommend new friends and their favorite songs to users based on the similarity of their music taste. We use the user listening habits dataset collected from last.fm, which you can find here.
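As a rough illustration of the idea, the sketch below finds a user's closest neighbor in a small user-by-track play-count matrix and suggests tracks that neighbor plays but the user has not heard. It is not the code from the notebooks: the function names `top_similar_users` and `recommend_tracks`, the toy `counts` matrix, the log scaling, and the choice of cosine similarity are all assumptions made for this example. Truncated SVD is used only because the repository contains a dimensionality reduction step and an SVD figure; the actual pipeline may differ.

```python
import numpy as np


def top_similar_users(listen_counts, user_index, k=2, n_components=2):
    """Return the indices of the k users most similar to user_index."""
    # Log-scale play counts so very heavy listeners do not dominate (assumption).
    X = np.log1p(np.asarray(listen_counts, dtype=float))
    # Truncated SVD: keep the leading n_components latent "taste" dimensions.
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    Z = U[:, :n_components] * s[:n_components]
    # Cosine similarity between the target user and every user.
    norms = np.linalg.norm(Z, axis=1)
    norms[norms == 0] = 1.0  # avoid division by zero for empty profiles
    Zn = Z / norms[:, None]
    sims = Zn @ Zn[user_index]
    sims[user_index] = -np.inf  # never recommend a user to themselves
    return np.argsort(-sims)[:k].tolist()


def recommend_tracks(listen_counts, user_index, friend_index, n=2):
    """Return up to n tracks the friend plays most that the user has not played."""
    X = np.asarray(listen_counts)
    unheard = X[user_index] == 0
    scores = np.where(unheard, X[friend_index], 0)
    order = np.argsort(-scores)
    return [int(t) for t in order[:n] if scores[t] > 0]


# Toy data (made up): rows are users, columns are tracks, values are play counts.
counts = [
    [10, 5, 0, 0, 0],
    [8, 4, 6, 0, 0],
    [0, 0, 0, 7, 9],
    [0, 1, 0, 5, 8],
]

friend = top_similar_users(counts, user_index=0, k=1)[0]
print("suggested friend:", friend)
print("suggested tracks:", recommend_tracks(counts, 0, friend))
```

On real data the matrix would be large and sparse, so a sparse or randomized SVD would likely be a better fit than the dense `np.linalg.svd` used here.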

To run the project, make sure your environment aligns with the library versions in requirements.txt.
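One way to check this is sketched below, assuming requirements.txt pins packages as `name==version` (the file's actual contents are not shown here). The helper `find_mismatches` is hypothetical and not part of the repository; it skips lines without an exact pin and does not handle environment markers or extras.

```python
from importlib import metadata


def find_mismatches(requirement_lines, get_version=metadata.version):
    """Return (package, pinned, installed) for every exact pin that is not met.

    installed is None when the package is not installed at all.
    """
    mismatches = []
    for raw in requirement_lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" not in line:
            continue  # only exact pins are checked in this sketch
        name, pinned = (part.strip() for part in line.split("==", 1))
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != pinned:
            mismatches.append((name, pinned, installed))
    return mismatches


def report(path="requirements.txt"):
    """Print one line per package whose installed version differs from the pin."""
    with open(path) as fh:
        for name, pinned, installed in find_mismatches(fh):
            print(f"{name}: pinned {pinned}, installed {installed}")
```

Calling `report()` from the repository root prints nothing when every pinned package is installed at the pinned version.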

Project Organization

.
├── .gitignore
├── LICENSE
├── Makefile
├── README.md
├── data
│   ├── external
│   ├── interim
│   │   └── user-listen-count.csv
│   └── processed
│       └── user_track_df_reduced.csv
├── notebooks
│   ├── 0.1-johnny-data_sampling.ipynb
│   ├── 0.2-johnny-preprocess.ipynb
│   ├── 0.3-johnny-dimensionality_reduction.ipynb
│   ├── 1.0-johnny-similarity.ipynb
│   ├── 1.1-johnny-similarity-optimized.ipynb
│   └── 2.0-chams-recommendation.ipynb
├── references
│   └── HMT11-Finding-Structure-SIREV.pdf
├── reports
│   └── figures
│       ├── Histogram of track listened times.png
│       ├── dendrogram.png
│       ├── dendrogram1.png
│       ├── hourly count.png
│       ├── monthly count.png
│       ├── reconstruction error.png
│       ├── scatter after svd.png
│       ├── tsne after.png
│       └── tsne before.png
├── requirements.txt
├── setup.py
└── src
    ├── __init__.py
    ├── data
    │   ├── __init__.py
    │   ├── dimensionality_reduction.py
    │   └── preprocessor.py
    ├── models
    │   ├── __init__.py
    │   ├── predict_model.py
    │   └── train_model.py
    └── visualization
        ├── __init__.py
        └── visualize.py

This project is based on the cookiecutter data science project template.