This repository contains experiments on the WikiQA dataset, mainly for investigating answer triggering: finding the relevant answer among a set of candidate answers to a question.
The base CNN implementation is from "Deep learning for answer sentence selection". The idea of using a CNN over word vectors for sentence selection is inspired by Yoon Kim's "CNN for Sentence Classification". The dataset is read from the `data/` folder, and its location is currently hardcoded in the last line of the code. The CNN is built with Keras.
The current implementation uses a CNN to map the question and answer word vectors into a higher-dimensional representation, then compares the two representations using either a dot product or a fully connected layer.
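As a rough illustration of this idea, the NumPy sketch below encodes a question and an answer with the same bank of convolution filters (widths 2, 3 and 4), max-pools each feature map over word positions, and scores the pair with a dot product. The function names `conv_maxpool` and `encode`, the random weights, and the sizes (50-dimensional word vectors, 32 filters per width) are assumptions made for the example. The repository itself uses Keras, and its actual layer sizes are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

EMBED_DIM = 50             # assumed word-vector size
N_FILTERS = 32             # assumed number of filters per width
FILTER_WIDTHS = (2, 3, 4)  # filter sizes from the CNN statistics table

# One random weight tensor per filter width, shape (width, EMBED_DIM, N_FILTERS).
# In a trained model these would be learned; here they are placeholders.
filters = {
    w: rng.normal(scale=0.1, size=(w, EMBED_DIM, N_FILTERS))
    for w in FILTER_WIDTHS
}

def conv_maxpool(sentence, weights):
    """Slide the filters over a (n_words, EMBED_DIM) matrix, then max-pool."""
    width = weights.shape[0]
    n_windows = sentence.shape[0] - width + 1
    # Each window of `width` words is contracted with the weights,
    # giving one value per filter at each position.
    feats = np.stack([
        np.tensordot(sentence[i:i + width], weights, axes=([0, 1], [0, 1]))
        for i in range(n_windows)
    ])
    # Non-linearity, then keep the strongest response of each filter.
    return np.tanh(feats).max(axis=0)

def encode(sentence):
    """Concatenate the pooled features of all filter widths."""
    return np.concatenate(
        [conv_maxpool(sentence, filters[w]) for w in FILTER_WIDTHS]
    )

# Stand-ins for the word vectors of a 7-word question and a 12-word answer.
question = rng.normal(size=(7, EMBED_DIM))
answer = rng.normal(size=(12, EMBED_DIM))

q_vec, a_vec = encode(question), encode(answer)
score = float(q_vec @ a_vec)  # dot-product similarity of the two encodings
print(q_vec.shape, a_vec.shape, score)
```

Sharing the filters between question and answer keeps both encodings in the same space, which is what makes the dot product meaningful. Replacing the dot product with a fully connected layer over the concatenated encodings would be the other comparison option mentioned above.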
CNN statistics
Parameter | Value |
---|---|
Filter Size | 2,3,4 |
Convolution | Levels |
Word vector | dimensions |
Dense Layer | Dimensions |
Pooling | Max Pooling or Average Pooling |
The diagram below shows the CNN architecture, taken from "Deep learning for answer sentence selection". This implementation additionally uses max pooling, and filters of size 3 and 4 in addition to size 2.
The output of the CNN is connected to a logistic layer, which classifies each answer to the question as relevant (1) or non-relevant (0).
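A logistic layer of this kind can be sketched in a few lines of plain Python: a weighted sum of the input features is passed through a sigmoid, and the resulting probability is thresholded. The function name `logistic_predict`, the example weights, and the 0.5 threshold are assumptions for illustration, not values taken from the repository.

```python
import math

def logistic_predict(features, weights, bias, threshold=0.5):
    """Return (probability, label) for one question-answer feature vector."""
    # Weighted sum of the features plus a bias term.
    z = sum(f * w for f, w in zip(features, weights)) + bias
    # The sigmoid squashes the sum into a probability in (0, 1).
    prob = 1.0 / (1.0 + math.exp(-z))
    # Label 1 means relevant, 0 means non-relevant.
    return prob, int(prob >= threshold)

# Hypothetical features and weights, chosen only to show both outcomes.
prob_pos, label_pos = logistic_predict([0.8, -0.2, 0.5], [1.5, -1.0, 0.7], bias=-0.3)
prob_neg, label_neg = logistic_predict([0.0, 0.0, 0.0], [1.5, -1.0, 0.7], bias=-2.0)
print(label_pos, label_neg)
```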
Because the dataset is highly skewed (one true answer to 5-6 false answers per question), random oversampling is used to balance the set.
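The balancing step can be sketched with the standard library alone: samples from the smaller class are duplicated at random until every class has as many samples as the largest one. The helper name `random_oversample` and the toy data are assumptions for illustration. The repository may instead rely on a library class such as imbalanced-learn's `RandomOverSampler`, which is also an assumption.

```python
import random
from collections import Counter

def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples at random until all classes match."""
    rng = random.Random(seed)

    # Group the samples by class label.
    by_label = {}
    for sample, label in zip(samples, labels):
        by_label.setdefault(label, []).append(sample)

    # Every class is grown to the size of the largest class.
    target = max(len(group) for group in by_label.values())

    balanced = []
    for label, group in by_label.items():
        # Draw extra samples with replacement from the same class.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        balanced.extend((sample, label) for sample in group + extra)

    # Shuffle so the duplicated samples are not clustered together.
    rng.shuffle(balanced)
    return [s for s, _ in balanced], [y for _, y in balanced]

# Toy data mirroring the skew: one relevant answer, five non-relevant ones.
samples = ["a1", "a2", "a3", "a4", "a5", "a6"]
labels = [1, 0, 0, 0, 0, 0]

X_bal, y_bal = random_oversample(samples, labels)
counts = Counter(y_bal)
print(counts)
```

Oversampling is normally applied only to the training split, so that evaluation still reflects the original class ratio.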