
Very Deep Pairwise Word Interaction Neural Networks for modeling textual similarity (He and Lin, NAACL/HLT 2016)


Very-Deep Pairwise Word Interaction Neural Networks for Modeling Textual Similarity

NOTE: This repo contains code for the original Torch implementation from the NAACL 2016 paper. The code is not being maintained anymore and has been superseded by a PyTorch reimplementation in Castor. This repo exists solely for archival purposes.

This repo contains the Torch implementation of the very-deep pairwise word interaction neural network for modeling textual similarity, as described in He and Lin (NAACL/HLT 2016).

This model does not require external resources such as WordNet or parsers, does not use sparse features, and achieves good accuracy on standard public datasets.

Installation and Dependencies

  • Please install the Torch deep learning library. We recommend the local installation, which includes all the packages our tool requires. Follow the instructions here: https://github.com/torch/distro

  • Our tool currently runs only on CPUs, so we recommend using the Intel MKL library (or at least OpenBLAS) so that Torch runs much faster on CPUs.

  • Our tool also requires the GloVe embeddings from Stanford. Run fetch_and_preprocess.sh to download and preprocess this data (around 3 GB).

Running

  • Command to run (training, tuning, and testing are all included):
    th trainSIC.lua

The tool outputs Pearson correlation scores and writes the predicted similarity score for each sentence pair in the test data to the predictions directory.
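
For readers who want to check a reported score by hand, the following is a minimal sketch of how a Pearson correlation between predicted and gold similarity scores can be computed. It is plain Lua with no Torch dependency and is not taken from this repo's code. The function name `pearson` and the sample values are assumptions made purely for illustration, and the sketch makes no claim about the format of the files in the predictions directory.

```lua
-- Illustrative sketch only: not the repo's implementation.
-- Computes the Pearson correlation between two equal-length arrays.
local function pearson(x, y)
  assert(#x == #y and #x > 1, "need two equal-length arrays with at least 2 items")
  local n = #x

  -- First pass: means of both arrays.
  local sum_x, sum_y = 0, 0
  for i = 1, n do
    sum_x = sum_x + x[i]
    sum_y = sum_y + y[i]
  end
  local mean_x, mean_y = sum_x / n, sum_y / n

  -- Second pass: co-deviation and squared deviations.
  local cov, var_x, var_y = 0, 0, 0
  for i = 1, n do
    local dx, dy = x[i] - mean_x, y[i] - mean_y
    cov = cov + dx * dy
    var_x = var_x + dx * dx
    var_y = var_y + dy * dy
  end

  -- The correlation is undefined when either array is constant.
  assert(var_x > 0 and var_y > 0, "correlation undefined for constant input")
  return cov / math.sqrt(var_x * var_y)
end

-- Hypothetical scores, made up for this example.
local predicted = {4.2, 3.1, 1.5, 2.8}
local gold      = {4.5, 3.0, 1.0, 3.2}
print(string.format("Pearson r = %.4f", pearson(predicted, gold)))
```

A value close to 1 means the predicted scores rise and fall with the gold scores. The two-pass form (means first, then deviations) is used here because it is easy to read and numerically steadier than the single-pass sum-of-products formula.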