Top-N recommendation is widely used in personalized services to serve users with diverse interests. However, as we observe from the data, user activity level also plays an important role in recommendation. Existing studies pay little attention to this issue: they simply assume that the preferences of all users follow a common probability distribution and then use a fixed schema (e.g., one latent vector) to model the user representation. This assumption makes it hard for existing models to accommodate users of different activity levels. In this work, we propose the Variational Kernel Density Estimation (VKDE) model, a non-parametric estimator that aims to fit arbitrary preference distributions. VKDE divides the user representation into multiple latent vectors, each corresponding to one facet of the user's interests. Multiple local distributions are generated by a variational kernel function and then aggregated into the user's global preference distribution. To reduce training complexity while preserving recommendation effectiveness, we further propose a sampling strategy. Experimental results on three public datasets show that VKDE outperforms state-of-the-art models and greatly improves accuracy for users of different activity levels.
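To give a feel for the kernel-aggregation idea described above (multiple per-facet local distributions combined into one global preference distribution), here is a minimal 1-D sketch using Gaussian kernels. This is an illustration only, not the paper's actual variational model; the facet centers and bandwidth below are made-up toy values.

```python
import numpy as np

def gaussian_kernel(x, center, bandwidth):
    """Local density from one facet vector (1-D Gaussian kernel, sketch only)."""
    return np.exp(-0.5 * ((x - center) / bandwidth) ** 2) / (bandwidth * np.sqrt(2 * np.pi))

def global_preference_density(x, facet_centers, bandwidth=0.5):
    """Aggregate per-facet local densities into one global density (equal weights)."""
    local_densities = [gaussian_kernel(x, c, bandwidth) for c in facet_centers]
    return np.mean(local_densities, axis=0)

# Toy user with three one-faceted interests; the resulting density is multimodal,
# which a single latent vector (one mode) could not capture.
x = np.linspace(-5, 5, 1001)
density = global_preference_density(x, facet_centers=[-1.0, 0.0, 2.0])
```

Because each kernel is a proper density and the weights sum to one, the aggregated global density also integrates to one.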
```shell
pip install -r requirements.txt
```
We provide three processed datasets: Yelp2018, Amazon-book, and Video Games (the other two datasets will be uploaded soon). See `dataloader.py` for details.
To run VKDE on the Yelp2018 dataset:

- Change the base directory: set `ROOT_PATH` in `src/world.py`.
- Run the command:

```shell
cd code/src && python main.py --dataset yelp2018 --topks=[20] --model VKDE --epoch 400 --tau_model2 0.1 --reg_model2 0.001 --dropout_model2 0.5 --lr 0.001 --cuda 0 --enc_dims [64]
```
- Log output:

```
...
======================
{'precision': array([0.03109132]), 'recall': array([0.06926436]), 'ndcg': array([0.05608385])}
EPOCH[11/400] Elapsed time: 85.2 Neg_ll: 148403.24
...
======================
{'precision': array([0.03548693]), 'recall': array([0.0784675]), 'ndcg': array([0.06469457])}
EPOCH[171/400] Elapsed time: 86.6 Neg_ll: 129988.29
...
```
To train with the proposed sampling strategy, add `--sampling 1`:

```shell
cd code/src && python main.py --dataset yelp2018 --topks=[20] --model VKDE --epoch 400 --tau_model2 0.1 --reg_model2 0.001 --dropout_model2 0.5 --lr 0.001 --enc_dims [64] --sampling 1 --cuda 0
```
All metrics are computed at top-20, as reported in the paper.
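For readers unfamiliar with the top-20 metrics in the logs above, here is a hedged per-user sketch of Recall@K and NDCG@K (standard definitions; the repo's own evaluation code may batch or average these differently):

```python
import numpy as np

def recall_at_k(ranked, relevant, k=20):
    """Fraction of a user's relevant items that appear in the top-k ranking."""
    hits = len(set(ranked[:k]) & relevant)
    return hits / len(relevant) if relevant else 0.0

def ndcg_at_k(ranked, relevant, k=20):
    """DCG of the top-k ranking, normalized by the best achievable DCG."""
    dcg = sum(1.0 / np.log2(i + 2) for i, item in enumerate(ranked[:k]) if item in relevant)
    idcg = sum(1.0 / np.log2(i + 2) for i in range(min(len(relevant), k)))
    return dcg / idcg if idcg > 0 else 0.0
```

The reported numbers are these per-user values averaged over all test users.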
PyTorch version results (stopped at 400 epochs, seed=2022):