- What is DeepRec
- How to Use
- Benchmark Results
- References
DeepRec is a portable, flexible, and comprehensive library that includes a variety of state-of-the-art deep learning based recommendation models. It targets the item ranking task. The current version supports two kinds of methods: feature-based methods and knowledge-enhanced methods. Feature-based methods apply deep learning models to extracted feature files in a specified format. Knowledge-enhanced methods leverage signals from a knowledge graph to improve recommendation performance. The currently supported models are listed below; more methods are expected in the near future.
- Environment: Linux, Python 3
- Dependent packages: tensorflow (>=1.7.0), sklearn, yaml, numpy
- For each method, prepare your data in the corresponding format listed in Table 1.
- Edit the corresponding configuration file listed in Table 1 to set the parameters for your method, such as the training and testing filenames. The /wiki directory gives more explanations of each method's parameters in the configuration file.
- Run a command of the form "python mainArg.py [the chosen model name] train/infer".
Here we give an example of running ExDeepFM; more examples can be found here.
- Download the data and prepare it in the required format (libffm for ExDeepFM). We assume you are in the root directory.
cd data
wget http://files.grouplens.org/datasets/movielens/ml-100k.zip
unzip ml-100k.zip
python ML-100K2Libffm.py
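For readers unfamiliar with libffm, each line of such a file has the form `label field:feature:value ...`. The snippet below is a minimal sketch of that line format only. The helper `to_libffm_line` and the field and feature ids are hypothetical, and the snippet does not reproduce what ML-100K2Libffm.py actually emits.

```python
def to_libffm_line(label, fields):
    """Format one sample as a libffm line: 'label field:feature:value ...'.

    `fields` is a list of (field_id, feature_id, value) triples.
    """
    parts = [str(label)]
    for field_id, feature_id, value in fields:
        # ':g' drops a trailing '.0', so 1.0 is written as '1'.
        parts.append(f"{field_id}:{feature_id}:{value:g}")
    return " ".join(parts)


# Hypothetical sample: a positive interaction with a user field (0),
# an item field (1), and a multi-hot genre field (2) with two active genres.
sample = [(0, 17, 1.0), (1, 242, 1.0), (2, 1001, 0.5), (2, 1005, 0.5)]
line = to_libffm_line(1, sample)
print(line)
```

Note that one field (here field 2) may appear several times in a line, which is how a multi-hot feature would be expressed in this format.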
- Edit the configuration file /config/exDeepFM.yaml for both training and testing, then run the following command to overwrite the configuration file that is actually used.
cp config/exDeepFM.yaml config/network.yaml
- Train the model with the following command. The first argument ("exDeepFM_Model_1") is the directory name for the results: the run creates /cache/exDeepFM_Model_1 for your cache files, /checkpoint/exDeepFM_Model_1 for your trained model, and /logs/exDeepFM_Model_1 for your training log. The second argument is the mode: choose "train" to train a model or "infer" to infer results.
python mainArg.py exDeepFM_Model_1 train
- Infer the results. Given the trained model saved in /checkpoint/exDeepFM_Model_1 in step 3, run:
python mainArg.py exDeepFM_Model_1 infer
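The two commands above share one pattern: a run name followed by a mode. As a rough sketch, a small wrapper could validate the mode and show where the results for a run name are expected to land. The helpers `build_command` and `output_dirs` are hypothetical and not part of DeepRec; only `mainArg.py`, the two modes, and the cache, checkpoint, and logs directories come from the steps above.

```python
import os

VALID_MODES = ("train", "infer")


def build_command(run_name, mode):
    """Return the argument list for 'python mainArg.py <run_name> <mode>'."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    return ["python", "mainArg.py", run_name, mode]


def output_dirs(run_name, root="."):
    """Map each kind of output to the directory described in the training step."""
    return {kind: os.path.join(root, kind, run_name)
            for kind in ("cache", "checkpoint", "logs")}


# Print the commands and directories for the example run; nothing is executed.
for mode in VALID_MODES:
    print(" ".join(build_command("exDeepFM_Model_1", mode)))
for kind, path in output_dirs("exDeepFM_Model_1").items():
    print(kind, "->", path)
```

To actually launch a run, the list returned by `build_command` can be passed to `subprocess.run(..., check=True)` from the repository root.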
We sample 3 million instances from the Criteo dataset and handle long-tail features and continuous features. The resulting dataset has 260,000 features and 3 million samples. We split the dataset randomly into three parts: 80% for training, 10% for validation, and 10% for testing.
model | auc | logloss | train time per epoch (s) |
---|---|---|---|
lr | 0.7779 | 0.4692 | 20.4 |
fm | 0.7895 | 0.4591 | 90.8 |
dnn | 0.7939 | 0.4552 | 425.1 |
ipnn | 0.7947 | 0.4546 | 413.3 |
opnn | 0.7957 | 0.4539 | 417.6 |
deepWide | 0.7936 | 0.4557 | 412.4 |
deepFM | 0.7944 | 0.4549 | 680.8 |
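
The random 80%/10%/10% split used for this benchmark could be reproduced along the following lines. This is a minimal sketch, not the script used for the reported numbers: the function name `split_indices` and the fixed seed are assumptions made for illustration.

```python
import random


def split_indices(n_samples, train_frac=0.8, valid_frac=0.1, seed=0):
    """Shuffle sample indices and cut them into train/validation/test parts.

    The test part receives whatever remains after the first two cuts.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    n_train = int(n_samples * train_frac)
    n_valid = int(n_samples * valid_frac)
    train = indices[:n_train]
    valid = indices[n_train:n_train + n_valid]
    test = indices[n_train + n_valid:]
    return train, valid, test


train, valid, test = split_indices(1000)
print(len(train), len(valid), len(test))
```

Splitting indices rather than the rows themselves keeps the sketch independent of the file format; the three index lists can then be used to write three separate data files.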
We conduct experiments on the Company* dataset, which has 200,000 samples and 190,000 features.
model | auc | logloss | train time per epoch (s) |
---|---|---|---|
lr | 0.6555 | 0.3914 | 21.9 |
fm | 0.6873 | 0.39 | 58.4 |
dnn | 0.7315 | 0.3711 | 201.7 |
ipnn | 0.7297 | 0.3712 | 199.3 |
opnn | 0.7332 | 0.3698 | 197.3 |
deepWide | 0.7346 | 0.3721 | 202.1 |
deepFM | 0.7324 | 0.3759 | 233.6 |
din | 0.7401 | 0.3763 | 331.4 |
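
For reference, the two quality metrics in the tables above can be computed from binary labels and predicted probabilities as sketched below. This is a plain-Python illustration of the standard definitions, not DeepRec's evaluation code. The pairwise AUC loop is quadratic and only suitable for small examples; scikit-learn's `roc_auc_score` and `log_loss` compute the same quantities efficiently.

```python
import math


def log_loss(labels, probs, eps=1e-15):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)


def auc(labels, probs):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [p for y, p in zip(labels, probs) if y == 1]
    neg = [p for y, p in zip(labels, probs) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


labels = [1, 0, 1, 0]
probs = [0.9, 0.2, 0.4, 0.6]
print("auc:", auc(labels, probs))  # 3 of 4 pairs are ordered correctly -> 0.75
print("logloss:", log_loss(labels, probs))
```

A higher AUC and a lower logloss are better, which is the sense in which the rows of the tables should be compared.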
- DeepRec supports the multi-hot data type by default; a sparse matrix is used to store the data.
- DeepRec is currently designed only for academic experiments. If the number of samples is larger than 10 million and the number of features is larger than 1 million, it may suffer from efficiency issues. We are working on improving efficiency.
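To illustrate why a sparse matrix suits multi-hot data, the sketch below stores only the active feature indices of each sample in a compressed sparse row (CSR) layout. This is a generic illustration with hypothetical helper names (`to_csr`, `row_features`), not DeepRec's internal storage code.

```python
def to_csr(rows, n_features):
    """Convert multi-hot rows (lists of active feature indices) to CSR arrays.

    Returns (data, indices, indptr): row i owns the slice
    indices[indptr[i]:indptr[i + 1]], and every stored value is 1.0.
    """
    data, indices, indptr = [], [], [0]
    for active in rows:
        for j in sorted(set(active)):  # deduplicate and order column indices
            if not 0 <= j < n_features:
                raise IndexError(f"feature index {j} out of range")
            indices.append(j)
            data.append(1.0)
        indptr.append(len(indices))
    return data, indices, indptr


def row_features(indices, indptr, i):
    """Return the active feature indices of row i."""
    return indices[indptr[i]:indptr[i + 1]]


# Three samples over five features; each sample may activate several features.
rows = [[0, 3], [2], [1, 3, 4]]
data, indices, indptr = to_csr(rows, n_features=5)
print(indices)  # [0, 3, 2, 1, 3, 4]
print(indptr)   # [0, 2, 3, 6]
print(row_features(indices, indptr, 2))  # [1, 3, 4]
```

Memory grows with the number of active features rather than with samples times features. The same three arrays can be passed to `scipy.sparse.csr_matrix((data, indices, indptr), shape=(3, 5))` if SciPy is available.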
- xDeepFM: Combining Explicit and Implicit Feature Interactions for Recommender Systems
- DKN: Deep Knowledge-Aware Network for News Recommendation
- A Factorization-Machine based Neural Network for CTR Prediction
- Deep Learning over Multi-field Categorical Data: A Case Study on User Response Prediction
- Product-based Neural Networks for User Response Prediction
- Wide & Deep Learning for Recommender Systems
- A Content-Boosted Collaborative Filtering Neural Network for Cross Domain Recommender Systems
- product-nets
- RippleNet: Propagating User Preferences on the Knowledge Graph for Recommender Systems