This repository contains the implementation of the reranking method proposed in *Multi-Level Interaction Reranking with User Behavior History*.
tensorflow-gpu >= 1.9.0,<2
numpy >= 1.16.4
scikit-learn >= 0.21.2
lightgbm >= 2.3.2
Download the Ad and PRM Public datasets, then preprocess them
python preprocess_ad.py
python preprocess_prm.py
Run initial ranker
python run_init_ranker.py
Run re-ranker
python run_mir.py
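The steps above have to run in order (preprocess, initial ranker, re-ranker). As a minimal sketch, a small driver could chain them for one dataset. The helper names `build_commands` and `run_pipeline` are hypothetical and not part of this repository; the scripts are invoked without flags, exactly as in the commands above, so dataset-specific arguments such as `--data_set_name` may still need to be appended.

```python
import subprocess
import sys

# Dataset name -> preprocessing script, as listed in this README.
PREPROCESS_SCRIPTS = {"ad": "preprocess_ad.py", "prm": "preprocess_prm.py"}


def build_commands(dataset):
    """Return the three pipeline commands for one dataset (hypothetical helper)."""
    if dataset not in PREPROCESS_SCRIPTS:
        raise ValueError("unknown dataset: %r" % (dataset,))
    python = sys.executable  # assumption: reuse the current interpreter
    return [
        [python, PREPROCESS_SCRIPTS[dataset]],  # 1. preprocess the raw data
        [python, "run_init_ranker.py"],         # 2. train the initial ranker
        [python, "run_mir.py"],                 # 3. train the re-ranker
    ]


def run_pipeline(dataset):
    """Run each step in order and stop at the first failing step."""
    for cmd in build_commands(dataset):
        subprocess.run(cmd, check=True)
```

Calling `run_pipeline("ad")` from the repository root would then execute the three steps for the Ad dataset.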
Model parameters can be set in a config file whose path is passed via `--setting_path`, e.g., `python run_ranker.py --setting_path config`. The config files for the different models can be found in `config/ad/` and `config/prm/`. Model parameters can also be set directly from the command line. The supported parameters are listed below.
Arguments for the initial ranker (`run_init_ranker.py`):

argument | usage |
---|---|
--data_dir | The path to the directory where the data is stored |
--save_dir | The path to the directory where the models and logs are stored |
--model_type | The algorithm of the initial ranker, one of DNN, DIN, and LambdaMART. Note: before training lambdaMART, you need to train DNN first to obtain the pre-trained embedding |
--setting_path | The path to the json config file, such as the files in example\config |
--max_hist_len | The maximum length of the user history |
--data_set_name | The name of the dataset, such as ad and prm |
--epoch_num | The number of training epochs for the DNN model |
--batch_size | Batch size for the DNN model |
--lr | Learning rate for DNN and lambdaMART |
--l2_reg | The coefficient of L2 regularization for the DNN model |
--eb_dim | The embedding size for the DNN model |
--tree_num | The number of trees for the lambdaMART model |
--tree_type | The type of tree for the lambdaMART model, either lgb or sklearn |
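
Since parameters can come from both a JSON config file and the command line, it may help to see how the two sources could be combined. The following is only a sketch under stated assumptions: it assumes the JSON keys match the flag names, that command-line values take precedence over the file, and it covers just a few of the documented flags. The actual scripts may resolve parameters differently, and `build_parser` and `resolve_params` are hypothetical names.

```python
import argparse
import json


def build_parser():
    """Parser for a small subset of the documented flags (illustrative only)."""
    parser = argparse.ArgumentParser()
    # Defaults are None so that "not given on the command line" is detectable.
    parser.add_argument("--setting_path", type=str, default=None)
    parser.add_argument("--data_set_name", type=str, default=None)
    parser.add_argument("--batch_size", type=int, default=None)
    parser.add_argument("--lr", type=float, default=None)
    return parser


def resolve_params(argv):
    """Load the JSON settings, then let explicit command-line flags override them."""
    args = vars(build_parser().parse_args(argv))
    params = {}
    setting_path = args.pop("setting_path")
    if setting_path is not None:
        with open(setting_path) as f:
            params.update(json.load(f))  # assumption: keys mirror the flag names
    # Assumption: command-line values win over values from the config file.
    params.update({k: v for k, v in args.items() if v is not None})
    return params
```

With a config file containing `{"lr": 0.001, "batch_size": 16}`, passing `--lr 0.01` on the command line would, under these assumptions, keep the batch size from the file and replace only the learning rate.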

Arguments for the re-ranker (`run_mir.py`):

argument | usage |
---|---|
--data_dir | The path to the directory where the data is stored |
--save_dir | The path to the directory where the models and logs are stored |
--setting_path | The path to the json config file, such as the files in example\config |
--data_set_name | The name of the dataset, such as ad and prm |
--initial_ranker | The name of the initial ranker, either DNN or lambdaMART |
--epoch_num | The number of training epochs |
--max_hist_len | The maximum length of the user history |
--batch_size | Batch size |
--lr | Learning rate |
--l2_reg | The coefficient of L2 regularization |
--eb_dim | The embedding size |
--hidden_size | The number of hidden units, usually the hidden size of the LSTM/GRU |
--keep_prob | The keep probability for dropout |
--metric_scope | The cutoffs at which metrics are computed; for example, with --metric_scope=[1, 3, 5], MAP@1, MAP@3, and MAP@5 are computed |
--max_norm | The maximum norm for gradient clipping |
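
To make `--metric_scope` concrete, here is a minimal sketch of computing MAP@k for several cutoffs. It assumes binary relevance labels (1 = clicked) that are already sorted by the model's ranking, and it normalizes AP@k by the number of relevant items found within the top k. Definitions of AP@k vary, so the metric code in this repository may normalize differently; the function names are illustrative.

```python
def average_precision_at_k(ranked_labels, k):
    """AP@k for one ranked list of binary labels (1 = relevant)."""
    hits = 0
    precision_sum = 0.0
    for rank, label in enumerate(ranked_labels[:k], start=1):
        if label:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant position
    # Assumption: normalize by the relevant items seen in the top k.
    return precision_sum / hits if hits else 0.0


def map_at_scopes(ranked_label_lists, metric_scope):
    """Mean AP@k over all lists, for every cutoff k in metric_scope."""
    n = len(ranked_label_lists)
    return {
        k: sum(average_precision_at_k(labels, k) for labels in ranked_label_lists) / n
        for k in metric_scope
    }
```

For example, `map_at_scopes(lists, [1, 3, 5])` returns a dictionary with one MAP value per cutoff, mirroring the MAP@1, MAP@3, and MAP@5 described in the table.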
We process two datasets, Ad and PRM Public. Both contain user and item features together with recommendation lists, and both are used for experiments on personalized re-ranking.
preprocess_ad.py and preprocess_prm.py: process Ad and PRM Public, respectively.
initial_model.py implements three initial ranking algorithms:
- DNN: a naive algorithm that directly trains a multi-layer perceptron network on the input labels (e.g., clicks).
- LambdaMART: an implementation of the LambdaMART model from *From RankNet to LambdaRank to LambdaMART: An Overview*.
- DIN: an implementation of the Deep Interest Network from *Deep Interest Network for Click-Through Rate Prediction*.
run_init_ranker.py: the entry point for the initial ranker.
model.py: the implementation of our proposed reranking method, MIR.
run_mir.py: the entry point for the re-ranker.