/STS-CNN-STSbenchmark-Semantic-Similarity-Semantic-Textual-Similarity-CNN-HCTI-Tensorflow-Keras

A simple implementation of paper "HCTI at SemEval-2017 Task 1: Use convolutional neural network to evaluate semantic textual similarity."

Primary LanguagePython

STSbenchmark

Introduction

This repository contains an implementation of paper "HCTI at SemEval-2017 Task 1: Use convolutional neural network to evaluate semantic textual similarity." https://www.aclweb.org/anthology/S17-2016

A simple CNN approach to get semantic textual similarity scores is proposed in the paper, which got the 3rd place in SemEval-2017 task 1. This implementation is based on python 3.5/Tensorflow 1.7.0/Keras 1.2.2 and can achieve over 0.77 on STSbenchmark datasets.

Citation

If you use these codes for your research, please cite:

@inproceedings{Shao:2017,
	author = {Yang Shao},
	title  = {{HCTI} at SemEval-2017 Task 1: Use convolutional neural network to evaluate Semantic Textual Similarity},
	booktitle = {Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval 2017)},
	year   = {2017},
	address = {Vancouver, Canada},
	publisher = {Association for Computational Linguistics},
	note = {\href{https://github.com/celarex}{URL}},
}

Yang Shao. 2017. HCTI at SemEval-2017 Task 1: Use convolutional neural network to evaluate semantic textual similarity. In Proceedings of SemEval-2017. https://github.com/celarex/Semantic-Textual-Similarity-STSbenchmark-HCTI-CNN

Training

To train your own model, you need to download GloVe pre-trained word vectors from http://nlp.stanford.edu/data/glove.840B.300d.zip first and unzip it in your folder.
To train your CNN model on STSbenchmark datasets:

python sts.py

Enjoy!