CIKM 2021: Learning Implicit User Profile for Personalized Retrieval-based Chatbot (pdf)
One of the datasets we use in this work is the PchatbotW dataset; please refer to this link for details.
In this paper, we evaluate IMPChat on two datasets, Weibo and Reddit:
Dataset:
Baidu Disk: Link (4hv3)
Google Storage: Link
Embedding:
Baidu Disk: Link (fob2)
Google Storage: Link
Answer Relevance:
Baidu Disk: Link (v6pv)
Google Storage: Link
Dataset:
Baidu Disk: Link (vg1j)
Google Storage: Link
Embedding:
Baidu Disk: Link (nnie)
Google Storage: Link
Answer Relevance:
Baidu Disk: Link (8mci)
Google Storage: Link
Note that each embedding file is trained on its corresponding dataset. The Answer Relevance file contains the relevance of each candidate.
Download the datasets and place them in the dataset directory.
Train the model with:
ts=`date +%Y%m%d-%H%M`
dataset=weibo # or reddit
CUDA_VISIBLE_DEVICES=4,6 python run.py \
--task ${dataset} \
--batch_size 128 \
--eval_steps 5000 \
--emb_len 200 \
--max_utterances 29 \
--learning_rate 5e-4 \
--max_words 50 \
--n_gpu 2 \
--epochs 10 \
--n_layer 3 \
--max_hop 2 \
--score_file_path score_file.txt \
--model_file_name ${dataset}_impchat.pt \
--is_training True
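The training and evaluation commands in this README share every flag except --is_training, so a small wrapper can keep them in sync. The following is a minimal sketch, not part of the repository: HPARAMS and build_command are hypothetical names, the values are copied from the command above, and the sketch only assembles the argument list (it does not set CUDA_VISIBLE_DEVICES or launch run.py).

```python
import shlex
from typing import Dict, List

# Hyperparameters copied verbatim from the command shown in this README.
HPARAMS: Dict[str, str] = {
    "batch_size": "128",
    "eval_steps": "5000",
    "emb_len": "200",
    "max_utterances": "29",
    "learning_rate": "5e-4",
    "max_words": "50",
    "n_gpu": "2",
    "epochs": "10",
    "n_layer": "3",
    "max_hop": "2",
    "score_file_path": "score_file.txt",
}


def build_command(dataset: str, training: bool) -> List[str]:
    """Assemble the run.py argument list for one dataset (hypothetical helper)."""
    if dataset not in ("weibo", "reddit"):
        raise ValueError(f"unknown dataset: {dataset!r}")
    cmd = ["python", "run.py", "--task", dataset]
    for flag, value in HPARAMS.items():
        cmd += [f"--{flag}", value]
    cmd += ["--model_file_name", f"{dataset}_impchat.pt"]
    if training:
        # The README passes this flag only for training runs.
        cmd += ["--is_training", "True"]
    return cmd


if __name__ == "__main__":
    # Print the command as a copy-pasteable shell line.
    print(shlex.join(build_command("weibo", training=True)))
```

Passing the returned list to subprocess.run, with CUDA_VISIBLE_DEVICES set in the environment, would reproduce the shell invocation.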
Download the checkpoint files and place them under the checkpoint directory:
Baidu Disk: Link (koon)
Google Storage: Link
Baidu Disk: Link (v7ck)
Google Storage: Link
Evaluate with the downloaded checkpoint:
ts=`date +%Y%m%d-%H%M`
dataset=weibo # or reddit
CUDA_VISIBLE_DEVICES=4,6 python run.py \
--task ${dataset} \
--batch_size 128 \
--eval_steps 5000 \
--emb_len 200 \
--max_utterances 29 \
--learning_rate 5e-4 \
--max_words 50 \
--n_gpu 2 \
--epochs 10 \
--n_layer 3 \
--max_hop 2 \
--score_file_path score_file.txt \
--model_file_name ${dataset}_impchat.pt
You can download all score files of the baseline models we use from the following links:
Baidu Disk: Link (vof6)
Google Storage: Link
Each score file is named {model name}.{task} (e.g. imp.weibo). You can compute the metrics with:
python metrics.py
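For readers who want to inspect a score file by hand, the sketch below shows how ranking metrics of this kind are typically computed. It is not the repository's metrics.py. It assumes, without confirmation from the source, that each line holds a score and a 0/1 label separated by a tab, and that the candidates of one query appear consecutively in groups of a fixed size. All function names are hypothetical.

```python
from typing import Iterable, List, Tuple

Group = List[Tuple[float, int]]


def split_score_file_name(name: str) -> Tuple[str, str]:
    """Split '{model name}.{task}', e.g. 'imp.weibo' -> ('imp', 'weibo')."""
    model, _, task = name.rpartition(".")
    if not model or not task:
        raise ValueError(f"expected '{{model name}}.{{task}}', got {name!r}")
    return model, task


def parse_groups(lines: Iterable[str], group_size: int = 10) -> List[Group]:
    """Parse 'score<TAB>label' lines (ASSUMED layout) into candidate groups."""
    rows: Group = []
    for line in lines:
        if line.strip():
            score, label = line.strip().split("\t")
            rows.append((float(score), int(label)))
    if len(rows) % group_size != 0:
        raise ValueError("number of rows is not a multiple of group_size")
    return [rows[i:i + group_size] for i in range(0, len(rows), group_size)]


def recall_at_k(groups: List[Group], k: int) -> float:
    """Mean fraction of relevant candidates ranked within the top k."""
    total, counted = 0.0, 0
    for group in groups:
        positives = sum(label for _, label in group)
        if positives == 0:
            continue  # skip groups without any relevant candidate
        ranked = sorted(group, key=lambda pair: pair[0], reverse=True)
        total += sum(label for _, label in ranked[:k]) / positives
        counted += 1
    return total / counted if counted else 0.0


def mean_reciprocal_rank(groups: List[Group]) -> float:
    """Mean of 1/rank of the first relevant candidate in each group."""
    total, counted = 0.0, 0
    for group in groups:
        ranked = sorted(group, key=lambda pair: pair[0], reverse=True)
        for rank, (_, label) in enumerate(ranked, start=1):
            if label == 1:
                total += 1.0 / rank
                counted += 1
                break
    return total / counted if counted else 0.0
```

To apply it to a downloaded file, read the file's lines, pass them to parse_groups with the group size that matches the data, and call the metric functions on the result. Check the actual file layout first, since the format above is only an assumption.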
@inproceedings{qian2021impchat,
author = {Hongjin Qian and Zhicheng Dou and Yutao Zhu and Yueyuan Ma and Ji-Rong Wen},
title = {Learning Implicit User Profile for Personalized Retrieval-based Chatbot},
booktitle = {Proceedings of the {CIKM} 2021},
publisher = {{ACM}},
year = {2021},
url = {https://doi.org/10.1145/3459637.3482269},
doi = {10.1145/3459637.3482269}}
@inproceedings{qian2021pchatbot,
author = {Hongjin Qian and Xiaohe Li and Hanxun Zhong and Yu Guo and Yueyuan Ma and Yutao Zhu and Zhanliang Liu and Zhicheng Dou and Ji-Rong Wen},
title = {Pchatbot: A Large-Scale Dataset for Personalized Chatbot},
booktitle = {Proceedings of the {SIGIR} 2021},
publisher = {{ACM}},
year = {2021},
url = {https://doi.org/10.1145/3404835.3463239},
doi = {10.1145/3404835.3463239}}