This is my implementation for the paper:
Xiangnan He, Lizi Liao, Hanwang Zhang, Liqiang Nie, Xia Hu and Tat-Seng Chua (2017). Neural Collaborative Filtering. In Proceedings of WWW '17, Perth, Australia, April 03-07, 2017.
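The code includes GMF, MLP, and NeuMF.
Note: this code uses mean squared error (MSE) as both the objective function and the evaluation metric (the paper uses cross-entropy as the objective function and HR and NDCG as evaluation metrics). The final layer of each model in this code has no activation function (the paper uses a Sigmoid activation in the final layer).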
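As a rough illustration of the difference noted above, the two objectives can be written side by side. The notation here is an assumption chosen for this sketch, not taken from the code: $y_{ui}$ is the target for user $u$ and item $i$, $\hat{y}_{ui}$ is the model output, $z_{ui}$ is the output of the last linear layer, $\mathcal{D}$ is the set of observed ratings, and $\mathcal{Y}$, $\mathcal{Y}^{-}$ are the observed interactions and sampled negatives used in the paper.

```latex
% This code: explicit ratings y_{ui} in [1, 5], no final activation
\hat{y}_{ui} = z_{ui}, \qquad
L_{\mathrm{MSE}} = \frac{1}{|\mathcal{D}|} \sum_{(u,i) \in \mathcal{D}} \left( y_{ui} - \hat{y}_{ui} \right)^2

% Paper: implicit feedback y_{ui} in {0, 1}, Sigmoid on the final layer
\hat{y}_{ui} = \sigma(z_{ui}), \qquad
L_{\mathrm{CE}} = - \sum_{(u,i) \in \mathcal{Y} \cup \mathcal{Y}^{-}}
  \left[ y_{ui} \log \hat{y}_{ui} + (1 - y_{ui}) \log \left( 1 - \hat{y}_{ui} \right) \right]
```

Without the Sigmoid the output is not confined to $(0, 1)$, which is what lets it regress ratings in the 1~5 range directly.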
- python 3.8
- pytorch 1.7.0
- Amazon(2014) http://jmcauley.ucsd.edu/data/amazon/links.html
- Yelp(2020) https://www.yelp.com/dataset
For example:
data/ratings_Digital_Music.csv
(Amazon Digital Music: rating only)
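Note: the first three columns of the dataset are the user id, the item id, and the rating (1~5).
When main.py is run, the dataset is split into training, validation, and test sets in a 0.8/0.1/0.1 ratio.
If you are using the Amazon/Yelp datasets in JSON format, you can preprocess them with data_preprocess.py.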
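A minimal sketch of a 0.8/0.1/0.1 split is shown here for orientation. It is not the code in main.py: the function name `split_ratings`, the fixed seed, and the shuffle-then-slice strategy are all assumptions made for this example.

```python
import random


def split_ratings(rows, ratios=(0.8, 0.1, 0.1), seed=0):
    """Shuffle rows and cut them into train/validation/test lists.

    Hypothetical helper: main.py may split the data differently.
    """
    rows = list(rows)
    random.Random(seed).shuffle(rows)  # seeded so the split is reproducible
    n_train = int(len(rows) * ratios[0])
    n_valid = int(len(rows) * ratios[1])
    train = rows[:n_train]
    valid = rows[n_train:n_train + n_valid]
    test = rows[n_train + n_valid:]  # the remainder goes to the test set
    return train, valid, test


# Toy (user id, item id, rating) triples with ratings in 1~5
ratings = [(u, i, 1 + (u + i) % 5) for u in range(10) for i in range(10)]
train, valid, test = split_ratings(ratings)
print(len(train), len(valid), len(test))
```

Giving the remainder to the test set means no row is dropped when the row count does not divide evenly.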
For example:
python data_preprocess.py --data_path Digital_Music_5.json --data_source amazon --save_file amazon_music_ratings.csv
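The conversion such a preprocessing step performs might look roughly like the sketch below. It is not the contents of data_preprocess.py: the helper names `json_lines_to_rows` and `write_csv` are hypothetical, and the sketch assumes one JSON object per line, with the fields `reviewerID`/`asin`/`overall` for Amazon reviews and `user_id`/`business_id`/`stars` for Yelp reviews.

```python
import csv
import io
import json

# Assumed field names for (user id, item id, rating) in each source
FIELDS = {
    "amazon": ("reviewerID", "asin", "overall"),
    "yelp": ("user_id", "business_id", "stars"),
}


def json_lines_to_rows(lines, data_source):
    """Yield (user id, item id, rating) from JSON-lines review records."""
    user_key, item_key, rating_key = FIELDS[data_source]
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        yield record[user_key], record[item_key], float(record[rating_key])


def write_csv(rows, out_file):
    """Write rows as CSV with user id, item id, rating in the first three columns."""
    writer = csv.writer(out_file)
    for row in rows:
        writer.writerow(row)


# In-memory demo with made-up records; real use would open the files instead
sample = [
    '{"reviewerID": "A1", "asin": "B001", "overall": 5.0, "reviewText": "great"}',
    '{"reviewerID": "A2", "asin": "B002", "overall": 3, "reviewText": "ok"}',
]
buffer = io.StringIO()
write_csv(json_lines_to_rows(sample, "amazon"), buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

Only the three columns the models need are kept; review text and other fields are discarded.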
Train and evaluate the model
python main.py --dataset_file data/ratings_Digital_Music.csv
Dataset (number of ratings) | Number of users | Number of items | MSE of GMF | MSE of MLP | MSE of NeuMF |
---|---|---|---|---|---|
movielens-small (100,836) | 610 | 9724 | - | - | 0.740655 |
Amazon music-small (64,706) | 5541 | 3568 | - | - | 0.822472 |
Amazon music (836,006) | 478235 | 266414 | - | - | 0.825261 |
Amazon Clothing, Shoes and Jewelry (5,748,919) | 3117268 | 1136004 | 1.520218 | 1.503066 | 1.502135 |
Yelp (8,021,121) | 1968703 | 209393 | 2.222917 | 2.041745 | 2.041674 |