
Sequence to Sequence with Attention in TensorFlow and Emotional Chatting Machine

About

This project implements the following models, with a focus on implementations that are easy to trace and understand.

  • Sequence-to-Sequence with Attention (Intro): a general encoder-decoder framework widely used in natural language processing tasks such as machine translation, summarization, and conversational agents.

  • Emotion Regression: a bi-directional LSTM with self-attention that predicts the intensities of four emotions (anger, fear, joy, sadness) in a sentence, proposed by Baziotis et al. (2018) for SemEval-2018 Task 1. A minimal sketch of the architecture follows this list.

  • Emotional Chatting Machine: an end-to-end framework from Zhou et al. (2018) that generates emotional responses in open-domain conversational systems (emotional chatbots).
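
Below is a minimal sketch (TensorFlow 1.x) of the emotion-regressor architecture described above: a bi-directional LSTM whose states are pooled by self-attention and projected to four intensities. Variable names and layer sizes are illustrative, not the repo's actual code.

```python
import tensorflow as tf  # TensorFlow 1.x

def emotion_regressor(embedded, seq_len, hidden=256):
    """Bi-LSTM + self-attention pooling -> 4 emotion intensities.

    embedded: [batch, time, emb_dim] float32; seq_len: [batch] int32.
    Padding positions are ignored by the RNN but, for brevity, not
    masked out of the attention weights here.
    """
    cell_fw = tf.nn.rnn_cell.LSTMCell(hidden)
    cell_bw = tf.nn.rnn_cell.LSTMCell(hidden)
    (out_fw, out_bw), _ = tf.nn.bidirectional_dynamic_rnn(
        cell_fw, cell_bw, embedded, sequence_length=seq_len, dtype=tf.float32)
    states = tf.concat([out_fw, out_bw], axis=-1)             # [B, T, 2H]
    scores = tf.squeeze(tf.layers.dense(states, 1), axis=-1)  # [B, T]
    weights = tf.nn.softmax(scores, axis=-1)                  # self-attention weights
    pooled = tf.reduce_sum(states * tf.expand_dims(weights, -1), axis=1)
    # Intensities in [0, 1] for [anger, fear, joy, sadness].
    return tf.layers.dense(pooled, 4, activation=tf.nn.sigmoid)
```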

Features

  • Model Configuration: model architectures are configured through YAML files; a sample configuration is sketched after this list.

  • Easy-to-trace: pipeline components (encoder, decoder, attention, beam search) are built with thorough annotations.

  • Easy-to-understand: a functional programming style keeps the workflow easy to follow.

  • Checkpoint Availability: partially trained models can be saved and restored via checkpoints.

  • Jupyter Notebook: the whole example workflow is demonstrated in Jupyter notebooks that are easy to experiment with.
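
For instance, a YAML configuration for the integer-sorting example below might look like this; the keys are hypothetical, so check the repo's config files for the actual schema:

```python
import yaml  # PyYAML

# Hypothetical configuration mirroring the integer-sorting example below.
CONFIG = """
embedding: {dim: 128, vocab_size: 1000}
encoder:   {type: bilstm, layers: 2, hidden: 256}
decoder:   {layers: 2, hidden: 256, attention: bahdanau}
inference: {beam_width: 5}
"""

config = yaml.safe_load(CONFIG)
print(config["decoder"]["attention"])  # -> bahdanau
```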

Example: Sorting of Integers

See the example folder for sample data.

Model: Seq2seq with attention

Embeddings: 128-dimensional, with a vocabulary size of 1000 (integers 0 to 999).

Encoder-Decoder: a 2-layer bi-directional LSTM encoder and a 2-layer LSTM decoder (256 hidden units) with the Bahdanau attention mechanism.
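
A minimal TensorFlow 1.x sketch of this encoder-decoder wiring, assuming the tf.contrib.seq2seq API (names are illustrative, not the repo's actual code):

```python
import tensorflow as tf  # TensorFlow 1.x
from tensorflow.contrib import rnn, seq2seq

def build_model(src_ids, src_len, vocab_size=1000, emb_dim=128, hidden=256):
    """2-layer bi-LSTM encoder + 2-layer LSTM decoder cell with Bahdanau attention."""
    embedding = tf.get_variable("embedding", [vocab_size, emb_dim])
    src_emb = tf.nn.embedding_lookup(embedding, src_ids)  # [B, T, E]

    # Stacked bi-directional encoder; fw/bw outputs are concatenated per step.
    cells_fw = [rnn.LSTMCell(hidden) for _ in range(2)]
    cells_bw = [rnn.LSTMCell(hidden) for _ in range(2)]
    enc_out, _, _ = rnn.stack_bidirectional_dynamic_rnn(
        cells_fw, cells_bw, src_emb, sequence_length=src_len, dtype=tf.float32)

    # Bahdanau (additive) attention over the encoder outputs.
    attention = seq2seq.BahdanauAttention(
        num_units=hidden, memory=enc_out, memory_sequence_length=src_len)
    dec_cell = rnn.MultiRNNCell([rnn.LSTMCell(hidden) for _ in range(2)])
    attn_cell = seq2seq.AttentionWrapper(
        dec_cell, attention, attention_layer_size=hidden)
    return embedding, enc_out, attn_cell
```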

Inference: beam search decoding with beam width 5.
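
With tf.contrib.seq2seq, beam search decoding could be wired roughly as follows; `dec_cell`, `enc_out`, `src_len`, `embedding`, and the output projection `proj` are assumed to come from the model sketch above (an illustration, not the repo's exact code):

```python
import tensorflow as tf  # TensorFlow 1.x
from tensorflow.contrib import seq2seq

def beam_search_decode(dec_cell, enc_out, src_len, embedding, proj,
                       batch_size, go_id, eos_id, max_len, beam_width=5):
    """Beam search decoding; the attention memory must be tiled beam_width times."""
    tiled_mem = seq2seq.tile_batch(enc_out, multiplier=beam_width)
    tiled_len = seq2seq.tile_batch(src_len, multiplier=beam_width)
    attention = seq2seq.BahdanauAttention(
        num_units=256, memory=tiled_mem, memory_sequence_length=tiled_len)
    cell = seq2seq.AttentionWrapper(dec_cell, attention, attention_layer_size=256)
    decoder = seq2seq.BeamSearchDecoder(
        cell, embedding,
        start_tokens=tf.fill([batch_size], go_id), end_token=eos_id,
        initial_state=cell.zero_state(batch_size * beam_width, tf.float32),
        beam_width=beam_width, output_layer=proj)
    outputs, _, _ = seq2seq.dynamic_decode(decoder, maximum_iterations=max_len)
    return outputs.predicted_ids[:, :, 0]  # [B, T], the top-scoring beam
```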

Training: cross-entropy loss, evaluated by perplexity.
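
The loss and metric reduce to a few lines: token-level cross entropy masked over padding, with perplexity as the exponential of the mean per-token loss (a sketch; tensor names are illustrative):

```python
import tensorflow as tf  # TensorFlow 1.x

def loss_and_perplexity(logits, targets, tgt_len):
    """logits: [batch, time, vocab]; targets: [batch, time]; tgt_len: [batch]."""
    crossent = tf.nn.sparse_softmax_cross_entropy_with_logits(
        labels=targets, logits=logits)
    mask = tf.sequence_mask(tgt_len, maxlen=tf.shape(targets)[1], dtype=tf.float32)
    loss = tf.reduce_sum(crossent * mask) / tf.reduce_sum(mask)  # per-token loss
    return loss, tf.exp(loss)                                    # perplexity = e^loss
```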

[Figures: training loss and perplexity curves]

Model: Emotional Chatting Machine

Prime numbers serve as simulated emotion words and are split evenly into four "emotion" categories.

A sequence is tagged with the "emotion" category that contributes the most primes, as sketched below.
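
One plausible reading of this setup in plain Python (the round-robin split into categories is an assumption; the repo's data generation may differ):

```python
def is_prime(n):
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))

# Split the primes below 1000 evenly (round-robin) into 4 "emotion" categories.
PRIMES = [n for n in range(1000) if is_prime(n)]
CATEGORY = {p: i % 4 for i, p in enumerate(PRIMES)}

def tag_emotion(seq):
    """Tag a sequence with the category contributing the most primes."""
    counts = [0] * 4
    for tok in seq:
        if tok in CATEGORY:
            counts[CATEGORY[tok]] += 1
    return counts.index(max(counts))  # ties resolved to the lowest category id
```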

Emotion Category Embeddings: a 32-dimensional vector for each emotion.

Internal Memory: a 64-dimensional vector for each emotion, read and written during decoding.

External Memory: explicitly promotes emotion words over generic words during decoding, as sketched below.
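
In the ECM paper, the external memory is a sigmoid "type selector" that, at each decoding step, chooses between a distribution over emotion words and one over generic words. A minimal TF 1.x sketch, where `emotion_mask` marks which vocabulary entries are emotion words (names are illustrative):

```python
import tensorflow as tf  # TensorFlow 1.x

def external_memory_dist(state, vocab_logits, emotion_mask):
    """state: [batch, hidden]; vocab_logits: [batch, vocab];
    emotion_mask: [vocab] float32, 1.0 for emotion words, 0.0 otherwise."""
    gate = tf.layers.dense(state, 1, activation=tf.nn.sigmoid)  # type selector
    neg_inf = -1e9
    emo = tf.nn.softmax(vocab_logits + (1.0 - emotion_mask) * neg_inf)  # emotion words
    gen = tf.nn.softmax(vocab_logits + emotion_mask * neg_inf)          # generic words
    return gate * emo + (1.0 - gate) * gen  # final output distribution
```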

[Figures: training loss and perplexity curves]

[Figures: predicted vs. true choices for emotion words]

Usage

Clone the repo and set up a virtualenv:

git clone https://github.com/AaronYALai/Seq2seqAttnTF

virtualenv env3 -p python3

source env3/bin/activate

cd Seq2seqAttnTF

Install all requirements, then train and run inference on the example data and configuration:

pip install -r requirements.txt

python train.py

python infer.py

Emotional Chatting Machine:

python train_ECM.py

python infer_ECM.py

To install the GPU version of TensorFlow:

pip install tensorflow-gpu

Launch Jupyter Notebook for interactive execution:

jupyter notebook

References

  • D. Bahdanau, K. Cho, Y. Bengio. "Neural Machine Translation by Jointly Learning to Align and Translate." ICLR 2015.

  • H. Zhou, M. Huang, T. Zhang, X. Zhu, and B. Liu. "Emotional Chatting Machine: Emotional Conversation Generation with Internal and External Memory." AAAI 2018.

  • C. Baziotis et al. "NTUA-SLP at SemEval-2018 Task 1: Predicting Affective Content in Tweets with Deep Attentive RNNs and Transfer Learning." arXiv preprint, 2018.

  • Seq2seq with Attention Introduction

  • Google's Seq2seq with Attention

  • Im2Latex

  • SemEval-2018 Task 1