This is a collection of chat corpora gathered from various open sources.
Every file consists of question-answer pairs:
odd-numbered lines are questions and even-numbered lines are answers.
I use them to train a chatbot with a seq2seq model.
theory: http://arxiv.org/abs/1406.1078
implementation: https://github.com/Marsan-Ma/tf_chatbot_seq2seq_antilm.git
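
The odd/even line layout described above can be read into (question, answer) pairs with a few lines of Python. This is only a minimal sketch, not code from the linked implementation: the names read_pairs and load_corpus are hypothetical, UTF-8 encoding is an assumption, and dropping a trailing question that has no answer is a choice made here for illustration.

```python
from typing import Iterable, List, Tuple


def read_pairs(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Pair line 1 with line 2, line 3 with line 4, and so on."""
    # Strip only line endings; blank lines are kept so that the
    # odd/even alignment of questions and answers is not shifted.
    stripped = [line.rstrip("\r\n") for line in lines]
    # Assumption: a final question without an answer is dropped.
    usable = len(stripped) - len(stripped) % 2
    return [(stripped[i], stripped[i + 1]) for i in range(0, usable, 2)]


def load_corpus(path: str, encoding: str = "utf-8") -> List[Tuple[str, str]]:
    """Read one corpus file; the UTF-8 default is an assumption."""
    with open(path, encoding=encoding) as f:
        return read_pairs(f)
```

Calling load_corpus on one of the corpus files should return a list of tuples, with the question in position 0 and the answer in position 1 of each tuple.
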
English movie subtitles parsed from
http://opus.lingfil.uu.se/download.php?f=OpenSubtitles/en.tar.gz
Cornell Movie-Dialogs Corpus
http://www.mpi-sws.org/~cristian/Cornell_Movie-Dialogs_Corpus.html
Lyrics from the PTT forum
https://www.ptt.cc/bbs/lyrics/index.html