Research Benchmarks for Online Prediction Tasks. The initial commit was transferred from DeepInterestNetwork, dnn_ctr, and kaggle-2014-criteo. We acknowledge and appreciate the original authors' efforts.
This repo is under construction...
Deep Interest Network for Click-Through Rate Prediction
This is an implementation of the paper "Deep Interest Network for Click-Through Rate Prediction" by Guorui Zhou, Chengru Song, Xiaoqiang Zhu, Han Zhu, Ying Fan, Na Mou, Xiao Ma, Yanghui Yan, Xingya Dai, Junqi Jin, Han Li, and Kun Gai.
Thanks to Jinze Bai and Chang Zhou.
Bibtex:
@article{Zhou2017Deep,
title={Deep Interest Network for Click-Through Rate Prediction},
author={Zhou, Guorui and Song, Chengru and Zhu, Xiaoqiang and Ma, Xiao and Yan, Yanghui and Dai, Xingya and Zhu, Han and Jin, Junqi and Li, Han and Gai, Kun},
year={2017},
}
- Python >= 3.6.1
- NumPy >= 1.15.0
- Pandas >= 0.23.4
- TensorFlow >= 1.8.0
- GPU with memory >= 10G
- Step 1: Download the Amazon product dataset for the Electronics category, which has 498,196 products and 7,824,482 records, and extract it to the raw_data/ folder.
mkdir raw_data/;
cd utils;
bash 0_download_raw.sh;
- Step 2: Convert the raw data to pandas DataFrames and remap the categorical IDs.
python 1_convert_pd.py;
python 2_remap_id.py
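As an illustration of what Step 2 involves, the sketch below loads a few records into a pandas DataFrame and remaps string IDs to dense integers. It is not the code in 1_convert_pd.py or 2_remap_id.py; the helper build_id_map, the toy records, and the column names reviewerID, asin, and unixReviewTime are assumptions made for the example.

```python
import pandas as pd


def build_id_map(series):
    """Map each distinct value to a dense integer index (sorted order).

    Hypothetical helper for illustration; not taken from the repo scripts.
    """
    keys = sorted(series.unique().tolist())
    mapping = {key: idx for idx, key in enumerate(keys)}
    return mapping, keys


# Toy records standing in for the raw review data (assumed column names).
reviews = pd.DataFrame({
    "reviewerID": ["u_b", "u_a", "u_b", "u_c"],
    "asin": ["p_9", "p_3", "p_3", "p_7"],
    "unixReviewTime": [30, 10, 20, 40],
})

# Build one mapping per categorical column.
user_map, user_keys = build_id_map(reviews["reviewerID"])
item_map, item_keys = build_id_map(reviews["asin"])

# Replace the string IDs with integer indices, then order each user's
# records by time so that behavior sequences can be built later.
remapped = (
    reviews.assign(
        reviewerID=reviews["reviewerID"].map(user_map),
        asin=reviews["asin"].map(item_map),
    )
    .sort_values(["reviewerID", "unixReviewTime"])
    .reset_index(drop=True)
)

print(remapped)
```

Dense integer IDs of this kind can be used directly as row indices into an embedding table, which is why a remapping step usually precedes dataset building.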
This implementation contains not only the DIN method but also the competing methods, including Wide&Deep, PNN, and DeepFM. The training procedure is the same for all methods:
- Step 1: Choose a method and enter the folder.
cd din;
Alternatively, you can run one of the competing methods by entering its folder instead (cd deepFM, cd pnn, or cd wide_deep) and following the same instructions below.
- Step 2: Build the dataset adapted to the current method.
python build_dataset.py
- Step 3: Start training and evaluation in the background with the default arguments.
python train.py >log.txt 2>&1 &
- Step 4: Check the training and evaluation progress.
tail -f log.txt
tensorboard --logdir=save_path
There is also an implementation of Dice in the folder 'din'. You can try Dice by following the code annotations in din/model.py or by replacing model.py with model_dice.py.
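For reference, Dice as described in the DIN paper gates each activation with a sigmoid of its batch-standardized value: f(s) = p(s) * s + (1 - p(s)) * alpha * s. The NumPy sketch below illustrates that formula only. It is not the code in din/model_dice.py, and the function name dice, the fixed alpha (a learned parameter in the paper), and the use of statistics from the current batch are assumptions.

```python
import numpy as np


def dice(s, alpha=0.25, eps=1e-8):
    """Dice activation for a mini-batch `s` of shape (batch, features).

    Illustrative sketch: `alpha` is a fixed constant here, whereas the
    paper treats it as a learned parameter.
    """
    # Per-feature mean and variance over the batch dimension.
    mean = s.mean(axis=0, keepdims=True)
    var = s.var(axis=0, keepdims=True)
    # Gate: sigmoid of the standardized input.
    p = 1.0 / (1.0 + np.exp(-(s - mean) / np.sqrt(var + eps)))
    # Blend the identity branch and the alpha-scaled branch.
    return p * s + (1.0 - p) * alpha * s


# Toy batch: 4 examples, 2 features.
batch = np.array([[-2.0, 0.5],
                  [-1.0, 1.5],
                  [1.0, 2.5],
                  [2.0, 3.5]])
print(dice(batch))
```

Because the gate is centered on the batch mean rather than on zero, the switch point between the two branches follows the input distribution. With alpha = 1 the function reduces to the identity.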
A framework for dealing with CTR prediction problems
Details: https://zhuanlan.zhihu.com/p/32885978
FNN's introduction and API: https://zhuanlan.zhihu.com/p/33045184
PNN's introduction and API: https://zhuanlan.zhihu.com/p/33177517
DeepFM's introduction and API: https://zhuanlan.zhihu.com/p/33479030
AFM's introduction and API: https://zhuanlan.zhihu.com/p/33540686
NFM's introduction and API: https://zhuanlan.zhihu.com/p/33587540
DCN's introduction and API: https://zhuanlan.zhihu.com/p/33619389