seq2seq-chatbot

A sequence2sequence chatbot implementation with TensorFlow.

Primary language: Python. License: MIT.

See the instructions below to get started, or check out some chat logs.

Chatting with a trained model

To chat with a trained model from the model directory:

(Batch files are currently available only for Windows. macOS and Linux users should follow the Python console instructions below.)

  1. Make sure a model exists in the models directory (to get started, download and unzip trained_model_v2 into the seq2seq-chatbot/models/cornell_movie_dialog folder)

For console chat:

  1. From the model directory run chat_console_best_weights_training.bat or chat_console_best_weights_validation.bat

For web chat:

  1. From the model directory run chat_web_best_weights_training.bat or chat_web_best_weights_validation.bat

  2. Open a browser to the URL indicated by the server console, followed by /chat_ui.html. This is typically: http://localhost:8080/chat_ui.html

To chat with a trained model from a Python console:

  1. Set console working directory to the seq2seq-chatbot directory. This directory should have the models and datasets directories directly within it.

  2. Run chat.py with the model checkpoint path:

run chat.py models\dataset_name\model_name\checkpoint.ckpt
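The command above uses Windows-style backslashes. On macOS or Linux the same path needs forward slashes. The sketch below builds the checkpoint path with pathlib so the separator matches the host OS. The helper names `checkpoint_path` and `chat_command` are hypothetical and not part of the repository, and the sketch assumes chat.py accepts the same positional argument when started as `python chat.py ...` from a terminal.

```python
from pathlib import Path


def checkpoint_path(dataset_name, model_name, checkpoint="checkpoint.ckpt", root="models"):
    """Build <root>/<dataset_name>/<model_name>/<checkpoint> using the host OS separator."""
    return Path(root) / dataset_name / model_name / checkpoint


def chat_command(path):
    """Return an argument list for chat.py (assumes `python` is on PATH)."""
    return ["python", "chat.py", str(path)]


if __name__ == "__main__":
    path = checkpoint_path(
        "cornell_movie_dialog", "trained_model_v2", "best_weights_training.ckpt"
    )
    # Prints the command with "/" on macOS/Linux and "\" on Windows.
    print(" ".join(chat_command(path)))
```

The argument list can be passed to `subprocess.run(...)` with the working directory set to the seq2seq-chatbot directory, as the steps above require.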

For example, to chat with trained_model_v2, the model trained on the Cornell movie dialog dataset:

  1. Download and unzip trained_model_v2 into the seq2seq-chatbot/models/cornell_movie_dialog folder

  2. Set console working directory to the seq2seq-chatbot directory

  3. Run:

run chat.py models\cornell_movie_dialog\trained_model_v2\best_weights_training.ckpt

The result should look like this:

[Screenshot of a console chat session]

Training a model

To train a model from a Python console:

  1. Configure the hparams.json file to the desired training hyperparameters

  2. Set console working directory to the seq2seq-chatbot directory. This directory should have the models and datasets directories directly within it.

  3. To train a new model, run train.py with the dataset path:

run train.py --datasetdir=datasets\dataset_name

Or to resume training an existing model, run train.py with the model checkpoint path:

run train.py --checkpointfile=models\dataset_name\model_name\checkpoint.ckpt
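The two forms of the train.py command above differ only in which flag is passed. As a minimal sketch, the hypothetical helper below (not part of the repository) builds the argument list for either case. Treating the two flags as mutually exclusive is an assumption made here for illustration; the source only shows each flag used on its own.

```python
from pathlib import Path


def train_args(datasetdir=None, checkpointfile=None):
    """Return train.py arguments for a new run (datasetdir) or a resumed run (checkpointfile)."""
    # Assumption: exactly one of the two options is given per invocation.
    if (datasetdir is None) == (checkpointfile is None):
        raise ValueError("pass exactly one of datasetdir or checkpointfile")
    if datasetdir is not None:
        return ["train.py", f"--datasetdir={Path(datasetdir)}"]
    return ["train.py", f"--checkpointfile={Path(checkpointfile)}"]


if __name__ == "__main__":
    # New model on a dataset directory.
    print(train_args(datasetdir=Path("datasets") / "dataset_name"))
    # Resume from an existing checkpoint.
    print(train_args(checkpointfile=Path("models") / "dataset_name" / "model_name" / "checkpoint.ckpt"))
```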

For example, to train a new model on the Cornell movie dialog dataset with default hyperparameters:

  1. Set console working directory to the seq2seq-chatbot directory

  2. Run:

run train.py --datasetdir=datasets\cornell_movie_dialog

The result should look like this:

[Screenshot of the training console output]
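To train with non-default hyperparameters, hparams.json (step 1 of the training instructions) can be edited by hand or from a script. The sketch below is one hedged way to do the latter. It assumes the values to change are top-level keys of a JSON object; settings nested inside sub-objects would need to be updated within their own object. The function name `update_hparams` and the key `example_param` are placeholders, not names taken from the repository.

```python
import json
import tempfile
from pathlib import Path


def update_hparams(path, overrides):
    """Overwrite existing top-level keys in a JSON hyperparameter file.

    Keys that are not already in the file are rejected, so a typo does not
    silently add a setting that the training script never reads.
    """
    path = Path(path)
    hparams = json.loads(path.read_text(encoding="utf-8"))
    unknown = sorted(set(overrides) - set(hparams))
    if unknown:
        raise KeyError(f"keys not present in {path.name}: {unknown}")
    hparams.update(overrides)
    path.write_text(json.dumps(hparams, indent=4), encoding="utf-8")
    return hparams


if __name__ == "__main__":
    # Demo on a throwaway file; "example_param" is a placeholder key.
    with tempfile.TemporaryDirectory() as tmp:
        demo = Path(tmp) / "hparams.json"
        demo.write_text(json.dumps({"example_param": 1}), encoding="utf-8")
        print(update_hparams(demo, {"example_param": 2}))
```

Inspect the real hparams.json first to see which keys exist before passing overrides.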

Transfer learning with pre-trained embeddings:

Docs coming soon...

Visualizing a model in TensorBoard

TensorBoard is a great tool for visualizing what is going on under the hood when a TensorFlow model is being trained.

To start TensorBoard from a terminal:

tensorboard --logdir=model_dir

Here, model_dir is the directory that contains the model checkpoint file. For example, to view the trained Cornell movie dialog model trained_model_v2:

tensorboard --logdir=models\cornell_movie_dialog\trained_model_v2
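Because model_dir is simply the directory that holds the checkpoint file, it can be derived from a checkpoint path. The sketch below shows this with a hypothetical helper, `tensorboard_command`, which is not part of the repository. It assumes the path is written with the host OS separator.

```python
from pathlib import Path


def tensorboard_command(checkpoint_file):
    """Return the tensorboard argument list for the directory holding checkpoint_file."""
    model_dir = Path(checkpoint_file).parent
    return ["tensorboard", f"--logdir={model_dir}"]


if __name__ == "__main__":
    checkpoint = (
        Path("models") / "cornell_movie_dialog" / "trained_model_v2" / "best_weights_training.ckpt"
    )
    # Prints the same command as above, with the separator of the host OS.
    print(" ".join(tensorboard_command(checkpoint)))
```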

Visualize training

Docs coming soon...

Visualize model graph

Docs coming soon...

Visualize word embeddings

TensorBoard can project the word embeddings into 3D space using a dimensionality reduction technique such as PCA or t-SNE. It also lets you explore how your model has grouped the words in your vocabulary by showing the nearest neighbors of any word in the embedding space. More about word embeddings in TensorFlow and the TensorBoard projector is covered in the TensorFlow documentation.

After launching TensorBoard for a model directory and selecting the "Projector" tab, the view should look like this:

[Screenshot of the TensorBoard Projector tab]
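To make the nearest-neighbor idea concrete, the sketch below ranks words by cosine similarity between their vectors. Cosine similarity is used here as one common choice of distance, not as a statement of what TensorBoard computes internally. The vectors are toy values invented for illustration; a trained model's embeddings have many more dimensions and come from the checkpoint.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_neighbors(word, embeddings, k=3):
    """Return the k words whose vectors are most similar to the vector of `word`."""
    query = embeddings[word]
    scored = [
        (cosine_similarity(query, vector), other)
        for other, vector in embeddings.items()
        if other != word
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [other for _, other in scored[:k]]


# Toy 3-dimensional embeddings (made up for this example).
toy_embeddings = {
    "hello": [0.9, 0.1, 0.0],
    "hi": [0.8, 0.2, 0.1],
    "goodbye": [0.1, 0.9, 0.0],
    "bye": [0.2, 0.8, 0.1],
}

print(nearest_neighbors("hello", toy_embeddings, k=1))  # ['hi']
```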

Adding a new dataset

Instructions coming soon...

Dependencies

The following Python packages are used in seq2seq-chatbot (excluding packages that come with Anaconda):

Roadmap

See the Roadmap page.

Acknowledgements

This implementation was inspired by:

Relevant papers

  1. Sequence to Sequence Learning with Neural Networks

  2. A Neural Conversational Model

  3. Neural Machine Translation by Jointly Learning to Align and Translate (Bahdanau attention mechanism)

  4. Effective Approaches to Attention-based Neural Machine Translation (Luong attention mechanism)