Some code for doing language modeling with Keras, in particular for question-answering tasks. I wrote a very long blog post that explains how a lot of this works, which can be found here.
- `attention_lstm.py`: Attentional LSTM, based on one of the papers referenced in the blog post, among others. One application is image captioning. The layer is initialized with an attention vector, which provides the attention component for the neural network.
- `insurance_qa_eval.py`: Evaluation framework for the InsuranceQA dataset. To get this working, clone the data repository and set the `INSURANCE_QA` environment variable to the path of the cloned repository. Changing `config` will adjust how the model is trained.
- `keras-language-model.py`: The `LanguageModel` class uses the `config` settings to generate a training model and a testing model. The model can be trained by passing a question vector, a ground-truth answer vector, and a bad answer vector to `fit`; `predict` then calculates the similarity between a question and an answer. Override the `build` method with whatever language model you want to get a trainable model. Examples are provided at the bottom, including `EmbeddingModel`, `ConvolutionModel`, and `RecurrentModel`.
- `word_embeddings.py`: A Word2Vec layer that uses the embeddings generated by Gensim's word2vec model to provide vectors in place of the Keras `Embedding` layer, which could help improve convergence, since fewer parameters need to be learned. Note that this requires generating a separate file with the word2vec weights, so it doesn't fit very nicely into the Keras architecture.
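To give a feel for what the attention vector in `attention_lstm.py` does, here is a minimal NumPy sketch of attention pooling (not the actual layer's API): hidden states are scored against a fixed attention vector, such as a question encoding, and combined into a weighted sum. All names and weight shapes here are illustrative assumptions.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_pool(hidden_states, attention_vec, W_h, W_a, w):
    # hidden_states: (T, d) LSTM outputs; attention_vec: (d,), e.g. a question encoding.
    # Score each timestep by how well it matches the attention vector,
    # then return the attention-weighted sum of the hidden states.
    m = np.tanh(hidden_states @ W_h + attention_vec @ W_a)  # (T, d)
    scores = softmax(m @ w)                                 # (T,)
    return scores @ hidden_states                           # (d,)

rng = np.random.default_rng(0)
T, d = 5, 4
h = rng.standard_normal((T, d))
a = rng.standard_normal(d)
pooled = attention_pool(h, a,
                        rng.standard_normal((d, d)),
                        rng.standard_normal((d, d)),
                        rng.standard_normal(d))
print(pooled.shape)  # (4,)
```

In the real layer the attention instead conditions the LSTM's gates at every timestep, but the scoring idea is the same.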
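Setting up `insurance_qa_eval.py` might look like the following; the clone location is an assumption, so substitute wherever you put the data repository.

```shell
# Point INSURANCE_QA at your clone of the InsuranceQA data repository
# (the path below is an assumption) before running the evaluation script.
export INSURANCE_QA="$HOME/insurance_qa_python"
echo "$INSURANCE_QA"
```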
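The question/good-answer/bad-answer triple passed to `fit` suggests a ranking objective: the question should be more similar to the ground-truth answer than to the bad answer. A common choice for this kind of model is a cosine-similarity hinge loss; whether `LanguageModel` uses exactly this loss is an assumption, and the sketch below is NumPy-only.

```python
import numpy as np

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def hinge_rank_loss(q, good, bad, margin=0.2):
    # Zero loss once the good answer outscores the bad one by `margin`.
    return max(0.0, margin - cosine(q, good) + cosine(q, bad))

q    = np.array([1.0, 0.0, 1.0])
good = np.array([0.9, 0.1, 1.1])   # encodes something close to the question
bad  = np.array([-1.0, 1.0, 0.0])  # unrelated answer
loss = hinge_rank_loss(q, good, bad)
print(loss)  # 0.0 -- the good answer already outranks the bad one by more than the margin
```

At test time, `predict` only needs the similarity score itself to rank candidate answers.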
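The idea behind `word_embeddings.py` can be sketched as building a fixed weight matrix from pretrained vectors and doing integer lookups into it; a Keras `Embedding` layer can be initialized the same way via its `weights` argument with `trainable=False`. Here a toy dict stands in for Gensim's lookup, so the words and vectors are made up.

```python
import numpy as np

# Toy stand-in for a trained word2vec model: word -> pretrained vector.
pretrained = {"what": [0.1, 0.2], "is": [0.3, 0.4], "coverage": [0.5, 0.6]}
vocab = {word: idx for idx, word in enumerate(pretrained)}  # word -> row index

# Weight matrix an Embedding-style layer would be initialized with.
matrix = np.zeros((len(vocab), 2))
for word, idx in vocab.items():
    matrix[idx] = pretrained[word]

def embed(tokens):
    # Integer-encode the tokens, then look up rows -- what the layer does at runtime.
    return matrix[[vocab[t] for t in tokens]]

print(embed(["what", "coverage"]).shape)  # (2, 2)
```

Freezing the matrix is what saves parameters; the trade-off noted above is that the weights have to be exported from Gensim to a file first.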