This is the codebase for ICLR 2021 paper: Modelling Hierarchical Structure between Dialogue Policy and Natural Language Generator with Option Framework for Task-oriented Dialogue System.
If you use any source codes included in this toolkit in your work, please cite the following paper. The bibtex is listed below:
@article{wang2020modelling,
title={Modelling hierarchical structure between dialogue policy and natural language generator with option framework for task-oriented dialogue system},
author={Wang, Jianhong and Zhang, Yuan and Kim, Tae-Kyun and Gu, Yunjie},
journal={arXiv preprint arXiv:2006.06814},
year={2020}
}
- configs: hyperparameters for different experiments
- data: dialogue dataset (MultiWowz 2.0 & MultiWoz 2.1)
- latent_dialog: source code
- human_evaluator.py: to evalute groud truth's performance
- supervised.py: main entry for Supervised Learning
- reinforce.py: main entry for Reinforcement Learning
- install conda environment
conda create -n hdno python=3.6
conda activate hdno
- install requirements
pip install -r requirements.txt
Before any operations below, please prepare your data following the script:
unzip data/multiwoz_2.0.zip -d data
unzip data/multiwoz_2.1.zip -d data
We give a script that can evaluate HDNO
and show the results shown on the paper, based on the pretrained models we provide on MultiWoz 2.0 and MultiWoz 2.1:
sh reproduce.sh
For the convenience for freely training models, we give simple bash scripts to do it.
- pretraining
sh train.sh sl woz2.0 # For MultiWoz 2.0
sh train.sh sl woz2.1 # For MultiWoz 2.1
- hierarchical reinforcement learning (HRL)
sh train.sh rl woz2.0 # For MultiWoz 2.0
sh train.sh rl woz2.1 # For MultiWoz 2.1
- evaluating trained model
sh test.sh sl woz2.0 5 # For MultiWoz 2.0 pretrained model
sh test.sh sl woz2.1 5 # For MultiWoz 2.1 pretrained model
sh test.sh rl woz2.0 2 # For MultiWoz 2.0 HRL model
sh test.sh rl woz2.1 5 # For MultiWoz 2.1 HRL model