/Neural_Collaborative_Filtering-1

This is a modified implementation of NeuCF,which used latest version of keras and tensorflow

Primary LanguagePythonApache License 2.0Apache-2.0

Neural Collaborative Filtering

This is a modified implementation of the following article,which used latest version of keras and tensorflow:

Xiangnan He, Lizi Liao, Hanwang Zhang, Liqiang Nie, Xia Hu and Tat-Seng Chua (2017). Neural Collaborative Filtering. In Proceedings of WWW '17, Perth, Australia, April 03-07, 2017.

Three collaborative filtering models: Generalized Matrix Factorization (GMF), Multi-Layer Perceptron (MLP), and Neural Matrix Factorization (NeuMF). To target the models for implicit feedback and ranking task, we optimize them using log loss with negative sampling.

Environment Settings

  • Keras version: '2.1.4'
  • Tensorflow version: '1.7'

Example to run the codes.

The instruction of commands has been clearly stated in the codes (see the parse_args function).

Run GMF:

python GMF.py --dataset ml-1m --epochs 20 --batch_size 256 --num_factors 8 --regs [0,0] --num_neg 4 --lr 0.001 --learner adam --verbose 1 --out 1

Run MLP:

python MLP.py --dataset ml-1m --epochs 20 --batch_size 256 --layers [64,32,16,8] --reg_layers [0,0,0,0] --num_neg 4 --lr 0.001 --learner adam --verbose 1 --out 1

Run NeuMF (without pre-training):

python NeuMF.py --dataset ml-1m --epochs 20 --batch_size 256 --num_factors 8 --layers [64,32,16,8] --reg_mf 0 --reg_layers [0,0,0,0] --num_neg 4 --lr 0.001 --learner adam --verbose 1 --out 1

Run NeuMF (with pre-training):

python NeuMF.py --dataset ml-1m --epochs 20 --batch_size 256 --num_factors 8 --layers [64,32,16,8] --num_neg 4 --lr 0.001 --learner adam --verbose 1 --out 1 --mf_pretrain Pretrain/ml-1m_GMF_8_1501651698.h5 --mlp_pretrain Pretrain/ml-1m_MLP_[64,32,16,8]_1501652038.h5

Note on tuning NeuMF: our experience is that for small predictive factors, running NeuMF without pre-training can achieve better performance than GMF and MLP. For large predictive factors, pre-training NeuMF can yield better performance (may need tune regularization for GMF and MLP).

Dataset

We provide two processed datasets: MovieLens 1 Million (ml-1m) and Pinterest (pinterest-20).

train.rating:

  • Train file.
  • Each Line is a training instance: userID\t itemID\t rating\t timestamp (if have)

test.rating:

  • Test file (positive instances).
  • Each Line is a testing instance: userID\t itemID\t rating\t timestamp (if have)

test.negative

  • Test file (negative instances).
  • Each line corresponds to the line of test.rating, containing 99 negative samples.
  • Each line is in the format: (userID,itemID)\t negativeItemID1\t negativeItemID2 ...

Last Update Date: August 2, 2018