A chatbot built on a seq2seq RNN model with an attention mechanism added to the decoder 🤖

Chatbot using Recurrent Neural Networks with Attention

Setup

You need to have PyTorch installed. Run main to try out the model. If you would like to train the model yourself, uncomment lines 55 and 56 and set 'checkpoint_iter' to 0 in config.yaml, where you will also find all of the other settings. Alternatively, download the pre-trained model from here; the easiest approach is to copy it into the data/save folder, next to the README file there.

Dataset

The Cornell Movie-Dialog Corpus is a rich dataset of movie character dialog:

  • 220,579 conversational exchanges between 10,292 pairs of movie characters
  • 9,035 characters from 617 movies
  • 304,713 total utterances

Download the ZIP file and put it in a data/ directory under the current directory.
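The corpus ships its dialog as plain-text files whose fields are separated by the marker " +++$+++ " (for movie_lines.txt: line ID, character ID, movie ID, character name, utterance text). As an illustration, a minimal sketch of loading such a file into a dictionary keyed by line ID; the field names and the iso-8859-1 encoding follow the corpus's distributed format, but treat them as assumptions to verify against the files you download:

```python
def load_lines(path, fields=("lineID", "characterID", "movieID", "character", "text")):
    """Parse a ' +++$+++ '-delimited corpus file into {lineID: field dict}."""
    lines = {}
    # The corpus files are not UTF-8; iso-8859-1 avoids decode errors.
    with open(path, encoding="iso-8859-1") as f:
        for line in f:
            values = line.rstrip("\n").split(" +++$+++ ")
            obj = dict(zip(fields, values))
            lines[obj["lineID"]] = obj
    return lines
```

With the ZIP extracted under data/, this would be called as `load_lines("data/movie_lines.txt")`.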

Model

The model used in this chatbot is a sequence-to-sequence (seq2seq) model, built from two separate recurrent neural networks. The first acts as an encoder, which encodes a variable-length input sequence into a fixed-length context vector. In theory, this context vector (the final hidden state of the RNN) contains semantic information about the query sentence given to the bot. The second RNN is a decoder, which takes an input word and the context vector, and returns a guess for the next word in the sequence along with a hidden state to use in the next iteration.

Encoder-decoder architecture (image source)
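The repository's actual implementation follows the PyTorch chatbot tutorial cited below; purely as an illustration of the encoder/decoder split described above, here is a minimal GRU-based sketch. All sizes, the single-layer setup, and the start-of-sentence token id are assumptions, not the project's real configuration:

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, vocab_size, hidden_size):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, hidden_size)
        self.gru = nn.GRU(hidden_size, hidden_size)

    def forward(self, input_seq, hidden=None):
        # input_seq: (seq_len, batch) of token ids
        embedded = self.embedding(input_seq)
        outputs, hidden = self.gru(embedded, hidden)
        # hidden (the final hidden state) is the fixed-length context vector
        return outputs, hidden

class Decoder(nn.Module):
    def __init__(self, vocab_size, hidden_size):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, hidden_size)
        self.gru = nn.GRU(hidden_size, hidden_size)
        self.out = nn.Linear(hidden_size, vocab_size)

    def forward(self, input_step, hidden):
        # input_step: (1, batch) -- decode one token at a time
        embedded = self.embedding(input_step)
        output, hidden = self.gru(embedded, hidden)
        logits = self.out(output.squeeze(0))  # (batch, vocab_size)
        return logits, hidden

# One decoding step: encode a query, then guess the next word.
vocab_size, hidden_size = 100, 16
enc, dec = Encoder(vocab_size, hidden_size), Decoder(vocab_size, hidden_size)
query = torch.randint(0, vocab_size, (5, 1))   # 5-token query, batch of 1
enc_outputs, context = enc(query)
sos = torch.tensor([[1]])                      # assumed start-of-sentence id
logits, hidden = dec(sos, context)
next_word = logits.argmax(dim=1)
```

At inference time the predicted word is fed back in as the next `input_step` until an end-of-sentence token is produced.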

An attention mechanism is also added to the decoder. There are several ways of doing this; this chatbot uses the approach presented by Luong et al. It uses global attention, which means that instead of considering only the encoder's final hidden state, the decoder considers all of the encoder's hidden states. The figure below shows how attention weights are applied to the encoder outputs to produce the attended decoder inputs.

Attention
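To make the figure concrete, here is a minimal sketch of Luong global attention with the simplest ("dot") scoring function: score every encoder output against the current decoder state, softmax the scores into weights, and take the weighted sum as the context vector. The function name and tensor layout are illustrative assumptions, not the repository's API:

```python
import torch
import torch.nn.functional as F

def luong_dot_attention(decoder_hidden, encoder_outputs):
    """decoder_hidden: (batch, hidden); encoder_outputs: (seq_len, batch, hidden)."""
    # Dot score of the decoder state against every encoder time step.
    scores = torch.sum(decoder_hidden * encoder_outputs, dim=2)   # (seq_len, batch)
    # Normalize over the sequence dimension to get attention weights.
    weights = F.softmax(scores.t(), dim=1).unsqueeze(1)           # (batch, 1, seq_len)
    # Weighted sum of encoder outputs -> attended context vector.
    context = weights.bmm(encoder_outputs.transpose(0, 1))        # (batch, 1, hidden)
    return context.squeeze(1), weights.squeeze(1)
```

Luong et al. also describe "general" and "concat" scoring functions, which introduce learned parameters into the scoring step; the dot variant is shown here only because it is parameter-free.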

Results

Loss

Chat:

I am home

Bot: ok . is there . . .

you could be more happy about it

Bot: i know . . . !

so smile then

Bot: i m trying to watch it up .

ok, could you help me with something?

Bot: you re in the city . . .

we could talk over video

Bot: oh . . . . .

you know how to do that?

Bot: not a man . . . .

ok.

Bot: ok . how s the phone soda ?

stop, could you help me with this task?

Bot: hmm... i am not sure I can help you with that.

why not?

Bot: because i have to ! ! !

you do not have to

Bot: why not ? ? ? ?

I can do it myself

Bot: why ? ? ? . .

I need to deliver i soon

Bot: i will . . . .

you will help?

Bot: it s possible . . . .

let me know then

Bot: no

are you stupid?

Bot: no . . times . .

times what?

Bot: your butter wine was a suspect .

now you are talking weird

Bot: they re both they re dying . !

did you kill them?

Bot: doesn t matter . ? ? ?

YES!

Bot: and you see that ? ? ?

i did not see it, i promise

Bot: you did . . . !

I have to go now

Bot: not for you ? ? ?

bye

Bot: wait ! . . .

References

  • Matthew Inkawhich. Chatbot tutorial. https://pytorch.org/tutorials/beginner/chatbot_tutorial.html. [Online; accessed 10-April-2020].
  • Oriol Vinyals and Quoc V. Le. A neural conversational model. CoRR, abs/1506.05869, 2015.
  • Minh-Thang Luong, Hieu Pham, and Christopher D. Manning. Effective approaches to attention-based neural machine translation. CoRR, abs/1508.04025, 2015.
  • Cristian Danescu-Niculescu-Mizil and Lillian Lee. Chameleons in imagined conversations: A new approach to understanding coordination of linguistic style in dialogs. In Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics, ACL 2011, 2011.
  • Ilya Sutskever, Oriol Vinyals, and Quoc V. Le. Sequence to sequence learning with neural networks. CoRR, abs/1409.3215, 2014.