This repository contains the code for the IJCNLP'2017 paper Attentive Language Models. The code is largely based on TensorFlow's tutorial on Recurrent Neural Networks. It allows training models with either the single or the combined score described in the paper on the PTB and wikitext2 datasets, and it should be straightforward to adapt to other datasets.

Requirements:
- Python 3
- TensorFlow >= 1.3.0
- NLTK
- We also need NLTK's default sentence tokenizer (the Punkt models) in order to split wikitext2 into sentences.
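If the Punkt models are not installed yet, they can be fetched with NLTK's downloader, for example from the command line:

```sh
python3 -c "import nltk; nltk.download('punkt')"
```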
To download PTB and to download and pre-process wikitext2, run the following from the root directory:

```sh
./get_and_process_data.sh
```

This script will create a folder called `data` containing the downloaded datasets (each one in its respective folder). The training scripts expect the datasets to be in this folder.
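Assuming the standard distributions of the two corpora, the resulting layout should look roughly like the following (a sketch; the exact folder and file names depend on the script):

```
data/
├── ptb/
│   ├── ptb.train.txt
│   ├── ptb.valid.txt
│   └── ptb.test.txt
└── wikitext2/
    ├── wiki.train.tokens
    ├── wiki.valid.tokens
    └── wiki.test.tokens
```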
You can control which dataset and model to train using the `--config` flag:

- `ptb_single`: trains an Attentive-LM with single score on PTB
- `ptb_combined`: trains an Attentive-LM with combined score on PTB
- `wikitext2_single`: trains an Attentive-LM with single score on wikitext2
- `wikitext2_combined`: trains an Attentive-LM with combined score on wikitext2
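For example, to train the combined-score model on wikitext2, pass `--config="wikitext2_combined"` to the training script (the directory paths here are only illustrative; the flags mirror the full example further below):

```sh
python3 -u train_attentive_lm.py \
  --config="wikitext2_combined" \
  --train_dir=${HOME}/train_lms/wikitext2_combined \
  --best_models_dir=${HOME}/train_lms/wikitext2_combined/best_models
```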
There are two options for running the experiments:

1. Open `start_train.sh`, change the `--config` flag in the script to the model you wish to train, and run it from the root folder:

   ```sh
   ./start_train.sh
   ```
2. Alternatively, create a model directory and call `train_attentive_lm.py` directly, for example:

   ```sh
   export MODEL_DIR=${HOME}/train_lms/ptb_single
   mkdir -p $MODEL_DIR
   python3 -u train_attentive_lm.py \
     --config="ptb_single" \
     --train_dir=$MODEL_DIR \
     --best_models_dir=$MODEL_DIR/best_models
   ```
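If the training script writes TensorFlow summaries to the training directory (an assumption; this depends on how `train_attentive_lm.py` is implemented), progress can be monitored with TensorBoard:

```sh
tensorboard --logdir=$MODEL_DIR
```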