Modeling Past and Future for Neural Machine Translation
This code is implemented on top of the popular Nematus codebase. If you use it, please cite our paper:
@article{Zheng:2018:TACL,
author = {Zheng, Zaixiang and Zhou, Hao and Huang, Shujian and Mou, Lili and Dai, Xinyu and Chen, Jiajun and Tu, Zhaopeng},
title = {Modeling Past and Future for Neural Machine Translation},
journal = {Transactions of the Association for Computational Linguistics},
year = {2018},
}
This code requires the following (a setup sketch follows the list):
- Python 2.7
- Theano >= 0.9
- mosesdecoder (only its scripts are needed)
- CUDA >= 8.0
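A minimal environment setup might look like the sketch below. The install command and clone URLs are the standard ones for Theano, mosesdecoder, and subword-nmt; they are given only as an example, not as this repository's official setup.

```bash
# Sketch of a possible setup; Python 2.7 and CUDA are assumed to be installed already.
pip install 'theano>=0.9'                                 # Theano backend
git clone https://github.com/moses-smt/mosesdecoder.git   # only its scripts are needed
git clone https://github.com/rsennrich/subword-nmt.git    # for Byte-Pair Encoding
```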
The overall workflow is:
- Data preparation
- Pretraining an RNNSearch model with Nematus
- Training
- Testing
Data preparation consists of the following steps (a command sketch follows this list):
- Data cleaning: filter out bad characters and unaligned sentence pairs
- Tokenization: use tokenizer.pl from mosesdecoder
- Lowercasing: if needed
- Subword segmentation: apply Byte-Pair Encoding (BPE)
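A possible command sequence for the English side of a corpus is sketched below. The file names (train.en, bpe.codes, etc.), the language code, and the number of BPE merge operations are placeholders; the tokenizer and lowercasing scripts are the standard ones shipped with mosesdecoder, and the BPE scripts come from subword-nmt.

```bash
# Sketch of data preparation for one side of the corpus (English here);
# file names and the 32000 merge operations are placeholders.
perl mosesdecoder/scripts/tokenizer/tokenizer.pl -l en < train.en > train.tok.en
perl mosesdecoder/scripts/tokenizer/lowercase.perl < train.tok.en > train.tok.lc.en
python subword-nmt/learn_bpe.py -s 32000 < train.tok.lc.en > bpe.codes
python subword-nmt/apply_bpe.py -c bpe.codes < train.tok.lc.en > train.bpe.en
```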
- Use Nematus to train a baseline RNNSearch model (a minimal sketch is given below)
- Or download the pretrained models here (not uploaded yet)
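As a rough sketch, the baseline can be pretrained with Nematus in the usual way, e.g. via a Theano-style config script; the device flag and the config file name below are placeholders taken from common Nematus usage, not from this repository (see the Nematus documentation for its actual training interface).

```bash
# Sketch: pretrain a baseline RNNSearch model with Nematus.
# config.py is a hypothetical wrapper around Nematus' training entry point,
# written in the style of the example configs distributed with Nematus.
THEANO_FLAGS=device=cuda0,floatX=float32 python config.py
```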
Run ./scripts/train.sh (edit it if needed) for training; see the script itself for details. The following options control the past and future layers (an example invocation follows the table):
option | description |
---|---|
--use_past_layer | (bool, default: False) whether to apply the past layer |
--use_future_layer | (bool, default: False) whether to apply the future layer |
--future_layer_type | (str, default: "gru_inside") type of RNN cell for the future layer; supported values: "gru", "gru_outside", "gru_inside" |
--use_subtractive_loss | (bool, default: False) whether to use the subtractive loss on the past and/or future layer |
--use_testing_loss | (bool, default: False) whether to use the subtractive loss during the testing phase |
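For example, a run with both layers enabled might set the options below before launching the script. Exactly where they are set inside ./scripts/train.sh depends on how the script is written, so treat this as a sketch rather than the script's actual contents.

```bash
# Sketch of a training run.  The flag spellings come from the table above;
# where they are set inside ./scripts/train.sh depends on the script itself:
#   --use_past_layer True
#   --use_future_layer True
#   --future_layer_type gru_inside
#   --use_subtractive_loss True
bash ./scripts/train.sh
```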
Run ./scripts/test.sh (edit it if needed) for testing; see the script itself for details.
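After decoding, translation quality can be scored with the standard multi-bleu.perl script from mosesdecoder. The output and reference file names below are placeholders; the real paths are set inside ./scripts/test.sh.

```bash
# Sketch of a test run followed by BLEU evaluation.
# output.txt / reference.txt are placeholders for the paths used in test.sh.
bash ./scripts/test.sh
perl mosesdecoder/scripts/generic/multi-bleu.perl reference.txt < output.txt
```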