/A-Hierarchical-Latent-Structure-for-Variational-Conversation-Modeling

PyTorch Implementation of "A Hierarchical Latent Structure for Variational Conversation Modeling" (NAACL 2018 Oral)

Primary LanguagePythonMIT LicenseMIT

Variational Hierarchical Conversation RNN (VHCR)

PyTorch 0.4 Implementation of "A Hierarchical Latent Structure for Variational Conversation Modeling" accepted in NAACL 2018 (Oral).

Prerequisite

Install Python packages

pip install -r requirements.txt

Download & Preprocess data

Following scripts will

  1. Create directories ./datasets/cornell/ and ./datasets/ubuntu/ respectively.

  2. Download and preprocess conversation data inside each directory.

python cornell_preprocess.py
    --max_sentence_length (maximum number of words in sentence; default: 30)
    --max_conversation_length (maximum turns of utterances in single conversation; default: 10)
    --max_vocab_size (maximum size of word vocabulary; default: 20000)
    --max_vocab_frequency (minimum frequency of word to be included in vocabulary; default: 5)
    --n_workers (number of workers for multiprocessing; default: os.cpu_count())
python ubuntu_preprocess.py
    --max_sentence_length (maximum number of words in sentence; default: 30)
    --max_conversation_length (maximum turns of utterances in single conversation; default: 10)
    --max_vocab_size (maximum size of word vocabulary; default: 20000)
    --max_vocab_frequency (minimum frequency of word to be included in vocabulary; default: 5)
    --n_workers (number of workers for multiprocessing; default: os.cpu_count())

Training

Go to the model directory and set the save_dir in configs.py (this is where the model checkpoints will be saved)

We provide our implementation of VHCR, as well as our reference implementations for HRED and VHRED.

To run training:

python train.py --data=<data> --model=<model> --batch_size=<batch_size>

For example:

  1. Train HRED on Cornell Movie:
python train.py --data=cornell --model=HRED
  1. Train VHRED with word drop of ratio 0.25 and kl annealing iterations 250000:
python train.py --data=ubuntu --model=VHRED --batch_size=40 --word_drop=0.25 --kl_annealing_iter=250000
  1. Train VHCR with utterance drop of ratio 0.25:
python train.py --data=ubuntu --model=VHCR --batch_size=40 --sentence_drop=0.25 --kl_annealing_iter=250000

By default, it will save a model checkpoint every epoch to <save_dir> and a tensorboard summary. For more arguments and options, see config.py.

Evaluation

To evaluate the word perplexity:

python eval.py --model=<model> --checkpoint=<path_to_your_checkpoint>

For embedding based metrics, you need to download Google News word vectors, unzip it and put it under the datasets folder. Then run:

python eval_embed.py --model=<model> --checkpoint=<path_to_your_checkpoint>

Reference

If you use this code or dataset as part of any published research, please refer the following paper.

@inproceedings{VHCR:2018:NAACL,
    author    = {Yookoon Park and Jaemin Cho and Gunhee Kim},
    title     = "{A Hierarchical Latent Structure for Variational Conversation Modeling}",
    booktitle = {NAACL},
    year      = 2018
    }