An Attention-based Seq2seq Neural Network Chatbot

Info

A Generative word-level chatbot with PyTorch trained on Microsoft's MetaLWOz data, hacked in a few days.

Code is documented along with really illustrative comments in the pytorch-seq2seq-chatbook.ipynb notebook.

You can view and run it step-by-step. Care, training time depends on the size of your data and your neural net/hardware configuration.

Technologies

Python 3
PyTorch
...and a really expensive GPU

Different versions

Before PyTorch, I also tried a char-level Keras model and a word-level TensorFlow model on the unfiltered dataset, which struggled a lot with memory usage causing multiple out-of-memory (OOM) problems, even on a Tesla K80 GPU.

You can find them on the appropriate different-versions folder.

Some workaround ideas:

removing sentences with a high number of words
Trying different amounts of hidden dimensions, units, etc

Future Work:

Importing pre-trained embeddings like GloVe
Use SOTA Transformers like BERT
Use early stopping on a validation set
Add chit-chat dialogues to the dataset

eloukas/seq2seq-chatbot

An Attention-based Seq2seq Neural Network Chatbot

Info

Technologies

Different versions

Future Work: