Sequence to Sequence Models in PyTorch

Minimal implementations of sequence to sequence models in PyTorch.

  • RNN Encoder-Decoder (Cho et al. 2014; Luong et al. 2015; Gu et al. 2016)
  • Pointer Networks (Vinyals et al. 2015)
  • CNNs from "Convolutional Sequence to Sequence Learning" (Gehring et al. 2017)
  • The Transformer from "Attention Is All You Need" (Vaswani et al. 2017)
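As a taste of the building blocks listed above, the Transformer's core operation, scaled dot-product attention (Vaswani et al. 2017), can be sketched in a few lines of PyTorch. The function name and tensor shapes here are illustrative, not this repo's actual API:

```python
import math
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v, mask=None):
    """Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V.

    q, k, v: (batch, heads, seq_len, d_k); mask: broadcastable to the
    (batch, heads, seq_len, seq_len) score matrix, 0 where attention
    is disallowed.
    """
    d_k = q.size(-1)
    # Similarity of each query against every key, scaled to keep
    # softmax gradients healthy for large d_k.
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = F.softmax(scores, dim=-1)
    # Each output position is a convex combination of the values.
    return weights @ v, weights

# Tiny smoke test: batch of 1, 2 heads, sequence length 5, d_k = 8.
q = torch.randn(1, 2, 5, 8)
k = torch.randn(1, 2, 5, 8)
v = torch.randn(1, 2, 5, 8)
out, weights = scaled_dot_product_attention(q, k, v)
print(out.shape)
```

The same routine, with a causal (lower-triangular) mask on the decoder side, is what a full multi-head attention layer wraps with learned projections.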

References

Rami Al-Rfou, Dokook Choe, Noah Constant, Mandy Guo, Llion Jones. 2018. Character-Level Language Modeling with Deeper Self-Attention. arXiv:1808.04444.

Philip Arthur, Graham Neubig, Satoshi Nakamura. 2016. Incorporating Discrete Translation Lexicons into Neural Machine Translation. In EMNLP.

Jimmy Lei Ba, Jamie Ryan Kiros, Geoffrey E. Hinton. 2016. Layer Normalization. arXiv:1607.06450.

Dzmitry Bahdanau, Kyunghyun Cho, Yoshua Bengio. 2015. Neural Machine Translation by Jointly Learning to Align and Translate. arXiv:1409.0473.

James Bradbury, Stephen Merity, Caiming Xiong, Richard Socher. 2016. Quasi-Recurrent Neural Networks. arXiv:1611.01576.

Denny Britz, Anna Goldie, Minh-Thang Luong, Quoc Le. 2017. Massive Exploration of Neural Machine Translation Architectures. arXiv:1703.03906.

Kyunghyun Cho, Bart van Merrienboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, Yoshua Bengio. 2014. Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation. arXiv:1406.1078.

Andrew M. Dai, Quoc V. Le. 2015. Semi-supervised Sequence Learning. arXiv:1511.01432.

Zihang Dai, Zhilin Yang, Yiming Yang, Jaime Carbonell, Quoc V. Le, Ruslan Salakhutdinov. 2019. Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context. In ACL.

Jiachen Du, Wenjie Li, Yulan He, Ruifeng Xu, Lidong Bing, Xuan Wang. 2018. Variational Autoregressive Decoder for Neural Response Generation. In EMNLP.

Jonas Gehring, Michael Auli, David Grangier, Denis Yarats, Yann N. Dauphin. 2017. Convolutional Sequence to Sequence Learning. arXiv:1705.03122.

Alex Graves. 2013. Generating Sequences With Recurrent Neural Networks. arXiv:1308.0850.

Jiatao Gu, Zhengdong Lu, Hang Li, Victor O.K. Li. 2016. Incorporating Copying Mechanism in Sequence-to-Sequence Learning. In ACL.

Jeremy Hylton. 1993. The Complete Works of William Shakespeare. http://shakespeare.mit.edu.

Marcin Junczys-Dowmunt. 2018. Dual Conditional Cross-Entropy Filtering of Noisy Parallel Corpora. In Proceedings of the Third Conference on Machine Translation (WMT): Shared Task Papers.

Łukasz Kaiser, Samy Bengio. 2018. Discrete Autoencoders for Sequence Models. arXiv:1801.09797.

Jing Li, Aixin Sun, Shafiq Joty. 2018. SEGBOT: A Generic Neural Text Segmentation Model with Pointer Network. In Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence.

Jiwei Li. 2017. Teaching Machines to Converse. Doctoral dissertation. Stanford University.

Junyang Lin, Xu Sun, Xuancheng Ren, Muyu Li, Qi Su. 2018. Learning When to Concentrate or Divert Attention: Self-Adaptive Attention Temperature for Neural Machine Translation. arXiv:1808.07374.

Minh-Thang Luong, Hieu Pham, Christopher D. Manning. 2015. Effective Approaches to Attention-based Neural Machine Translation. In EMNLP.

Xuezhe Ma, Zecong Hu, Jingzhou Liu, Nanyun Peng, Graham Neubig, Eduard Hovy. 2018. Stack-Pointer Networks for Dependency Parsing. In ACL.

Hideya Mino, Masao Utiyama, Eiichiro Sumita, Takenobu Tokunaga. 2017. Key-value Attention Mechanism for Neural Machine Translation. In Proceedings of the 8th International Joint Conference on Natural Language Processing.

Chan Young Park, Yulia Tsvetkov. 2019. Learning to Generate Word- and Phrase-Embeddings for Efficient Phrase-Based Neural Machine Translation. In Proceedings of the 3rd Workshop on Neural Generation and Translation.

Ofir Press, Lior Wolf. 2016. Using the Output Embedding to Improve Language Models. arXiv:1608.05859.

Abigail See, Peter J. Liu, Christopher D. Manning. 2017. Get To The Point: Summarization with Pointer-Generator Networks. arXiv:1704.04368.

Xiaoyu Shen, Hui Su, Shuzi Niu, Vera Demberg. 2018. Improving Variational Encoder-Decoders in Dialogue Generation. In AAAI.

Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin. 2017. Attention Is All You Need. In NIPS.

Oriol Vinyals, Meire Fortunato, Navdeep Jaitly. 2015. Pointer Networks. In NIPS.

Oriol Vinyals, Samy Bengio, Manjunath Kudlur. 2015. Order Matters: Sequence to sequence for sets. In ICLR.

Sean Welleck, Ilia Kulikov, Stephen Roller, Emily Dinan, Kyunghyun Cho, Jason Weston. 2019. Neural Text Generation with Unlikelihood Training. arXiv:1908.04319.

Sam Wiseman, Alexander M. Rush. 2016. Sequence-to-Sequence Learning as Beam-Search Optimization. arXiv:1606.02960.

Yonghui Wu, Mike Schuster, Zhifeng Chen, Quoc V. Le, Mohammad Norouzi, Wolfgang Macherey, Maxim Krikun, Yuan Cao, Qin Gao, Klaus Macherey, Jeff Klingner, Apurva Shah, Melvin Johnson, Xiaobing Liu, Łukasz Kaiser, Stephan Gouws, Yoshikiyo Kato, Taku Kudo, Hideto Kazawa, Keith Stevens, George Kurian, Nishant Patil, Wei Wang, Cliff Young, Jason Smith, Jason Riesa, Alex Rudnick, Oriol Vinyals, Greg Corrado, Macduff Hughes, Jeffrey Dean. 2016. Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation. arXiv:1609.08144.

Ziang Xie. 2018. Neural Text Generation: A Practical Guide. http://cs.stanford.edu/~zxie/textgen.pdf.

Feifei Zhai, Saloni Potdar, Bing Xiang, Bowen Zhou. 2017. Neural Models for Sequence Chunking. In AAAI.

Saizheng Zhang, Emily Dinan, Jack Urbanek, Arthur Szlam, Douwe Kiela, Jason Weston. 2018. Personalizing Dialogue Agents: I have a dog, do you have pets too? arXiv:1801.07243.