TA-Seq2Seq This project is built on Theano 0.9, python 2.7 and Blocks(https://github.com/mila-udem/blocks). Please make sure they are installed before running this project. Step 1: preparing the data This project requires 3 vocabularies, the query vocabulary, response vocabulary and topic vocabulary. You should build every vocabulary as a dictionary like {'I': 0, 'UNK': 1, 'a': 2, 'student': 3, '</s>': 4} and save it as an pkl file. This project also requires a query file, a response file and a topic word file, in which the query, response and topic word list attached of a case are saved separately in the same line of the three files. Step 2: checking the configurations Please refer to the function topicAawareJPData() in configurations_base.py as an example of how to write configuration of your experiment. Let me explain some important features: The 'topic_vocab_output' and 'topic_vocab_output' are set as the same topic vocabulary built beforehand. 'topic_embeddings' is the embedding matrix of all topic words in which the i-th row is the embedding of the i-th word in the topic word vocabulary. 'topical_word_num' is the number of topic words attached for every query (number of words in every line of topic word file). 'tw_vocab_overlap' is a one-hot matrix that maps topic words with their numbers in the response vocabulary. A simple case is as follows, I UNK a student </s> student(topic word)[[0 0 0 1 0] a(topic word) [0 0 1 0 0]] Step 3: Run!