See the original code of Stacked Attention Networks, implemented in Theano, here.

Requirements

- Keras
- Python 2.7