This is the code for the WSDM 2021 paper: 'Local Collaborative Autoencoders'.

LOCA

This is the official code for the WSDM 2021 paper: Local Collaborative Autoencoders.

The slides can be found here.


Dataset

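| Dataset | # Users | # Items | # Ratings  | Sparsity | Concentration |
|---------|---------|---------|------------|----------|---------------|
| ML10M   | 69,878  | 10,677  | 10,000,054 | 98.66%   | 48.04%        |
| ML20M   | 138,493 | 26,744  | 20,000,263 | 99.46%   | 66.43%        |
| AMusic  | 4,964   | 11,797  | 97,439     | 99.83%   | 14.93%        |
| AGames  | 13,063  | 17,408  | 236,415    | 99.90%   | 16.40%        |
| Yelp    | 25,677  | 25,815  | 731,671    | 99.89%   | 22.78%        |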
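
The Sparsity column is consistent with the usual definition, 1 - (# ratings) / (# users x # items). The short sketch below recomputes it from the counts in the table; the function name `sparsity` and the `STATS` dictionary are illustrative and not part of this repository.

```python
def sparsity(num_users: int, num_items: int, num_ratings: int) -> float:
    """Fraction of user-item pairs that have no observed rating."""
    return 1.0 - num_ratings / (num_users * num_items)


# (users, items, ratings) copied from the dataset table.
STATS = {
    "ML10M": (69_878, 10_677, 10_000_054),
    "ML20M": (138_493, 26_744, 20_000_263),
    "AMusic": (4_964, 11_797, 97_439),
    "AGames": (13_063, 17_408, 236_415),
    "Yelp": (25_677, 25_815, 731_671),
}

for name, (users, items, ratings) in STATS.items():
    # Format as a percentage with two decimals, as in the table.
    print(f"{name}: {sparsity(users, items, ratings):.2%}")
```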

We use five public benchmark datasets: MovieLens 10M (ML10M), MovieLens 20M (ML20M), Amazon Digital Music (AMusic), Amazon Video Games (AGames), and Yelp 2015 (Yelp). We convert all explicit ratings to binary values that indicate whether a rating is observed or missing. For the MovieLens datasets, we did not modify the original data except for binarization. For the Amazon datasets, we removed users with fewer than 10 ratings, resulting in 97,439 (Music) and 236,415 (Games) ratings. For the Yelp dataset, we pre-processed the Yelp 2015 challenge dataset as in "Fast Matrix Factorization for Online Recommendation with Implicit Feedback", where users and items with fewer than 10 interactions are removed.
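
As a rough illustration of the binarization and user-filtering steps described above, here is a minimal sketch in plain Python. It is not the preprocessing script used for the released datasets: the function name `binarize_and_filter`, the triple-based input format, and the single filtering pass are assumptions made for the example. It covers only the user-side filter described for the Amazon datasets, not the user-and-item filter used for Yelp.

```python
from collections import Counter


def binarize_and_filter(ratings, min_user_ratings=10):
    """Turn (user, item, rating) triples into observed (user, item) pairs.

    The rating value is discarded: a pair is either observed or missing.
    Users with fewer than `min_user_ratings` observed pairs are dropped.
    """
    # Binarize: keep only the fact that the pair was observed.
    pairs = {(user, item) for user, item, _rating in ratings}
    # Count observed items per user.
    counts = Counter(user for user, _item in pairs)
    return {(u, i) for u, i in pairs if counts[u] >= min_user_ratings}


# Toy data: user "a" rated 10 items, user "b" rated only 3.
toy = [("a", f"item{k}", 4.0) for k in range(10)]
toy += [("b", f"item{k}", 5.0) for k in range(3)]

kept = binarize_and_filter(toy)
print(sorted({user for user, _ in kept}))
```

In the toy example only user "a" survives the filter, because user "b" has fewer than 10 ratings.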

You can get preprocessed datasets from the link below.

https://drive.google.com/drive/folders/1DqchJ1RR2TZRNoVeU3MXcXLcMJG0fia_?usp=sharing

You can get the original datasets from the following links:

MovieLens: https://grouplens.org/datasets/movielens/

Amazon Review Data: https://nijianmo.github.io/amazon/

Yelp 2015: https://github.com/hexiangnan/sigir16-eals/tree/master/data


Basic Usage

  • Change the experimental settings in main_config.cfg and the model hyperparameters in model_config.
  • Run main.py to train and test models.
  • Command-line arguments are also accepted, using the same names as in the configuration files (both the main and the model config).

For example: python main.py --model_name MultVAE --lr 0.001
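
The idea of command-line arguments overriding configuration values can be sketched with the standard library alone. This is not the code in main.py: the function `load_settings`, the section name `[Model]`, and the sample values are hypothetical, and all values are kept as strings for simplicity.

```python
import argparse
import configparser


def load_settings(cfg_text, argv):
    """Read key/value pairs from config text, then apply CLI overrides."""
    cfg = configparser.ConfigParser()
    cfg.read_string(cfg_text)
    # Flatten all sections into one dictionary of settings.
    settings = {key: value
                for section in cfg.sections()
                for key, value in cfg[section].items()}

    # Expose every config key as an optional flag with the same name.
    cli = argparse.ArgumentParser()
    for key in settings:
        cli.add_argument(f"--{key}", default=None)
    overrides = vars(cli.parse_args(argv))

    # Only flags that were actually given replace the config values.
    settings.update({k: v for k, v in overrides.items() if v is not None})
    return settings


# Hypothetical config content, for illustration only.
SAMPLE_CFG = "[Model]\nmodel_name = EASE\nlr = 0.01\n"

print(load_settings(SAMPLE_CFG, ["--model_name", "MultVAE", "--lr", "0.001"]))
```

Flags that are not passed keep the value from the configuration text, so the file remains the single place for defaults.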

Running LOCA

Before running LOCA, you need (1) user embeddings to find local communities and (2) the global model to cover users who are not considered by local models.

  1. Train a single MultVAE model and a single EASE model to obtain the user embedding vectors and the global model:

python main.py --model_name MultVAE
python main.py --model_name EASE

  2. Train LOCA with the chosen backbone model:

python main.py --model_name LOCA_VAE
python main.py --model_name LOCA_EASE
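
To make the two prerequisites more concrete, the sketch below shows one way user embeddings could define local communities, and which users would then rely on the global model. It is an assumption-laden toy, not the LOCA implementation: the function names, the use of cosine similarity, the fixed community size `k`, and the hand-picked anchors are all illustrative choices.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def assign_communities(embeddings, anchors, k):
    """Group the k users most similar to each anchor into a community.

    Returns (communities, global_only), where global_only holds the users
    that no local community covers.
    """
    communities = {}
    for anchor in anchors:
        ranked = sorted(
            embeddings,
            key=lambda user: cosine(embeddings[anchor], embeddings[user]),
            reverse=True,
        )
        communities[anchor] = set(ranked[:k])
    covered = set().union(*communities.values())
    global_only = set(embeddings) - covered
    return communities, global_only


# Toy 2-D embeddings; in practice these would come from a trained model.
toy_embeddings = {"u1": [1, 0], "u2": [0.9, 0.1], "u3": [0, 1], "u4": [-1, 0]}
communities, global_only = assign_communities(toy_embeddings, ["u1", "u3"], k=2)
print(communities)
print(global_only)
```

Here "u4" is not close to either anchor, so it falls outside every community and would be served by the global model alone.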


Requirements

  • Python 3
  • PyTorch 1.5