
Paraphrase models using Twitter as a data resource

This repository contains the data used in the following paper:

  title      = {Gathering and generating paraphrases from {Twitter} with application to normalization},
  author     = {Xu, Wei and Ritter, Alan and Grishman, Ralph},
  booktitle  = {Proceedings of the Sixth Workshop on Building and Using Comparable Corpora (BUCC)},
  year       = {2013},
  url        = {http://aclweb.org/anthology/W/W13/W13-2515.pdf}

The repository https://github.com/cocoxu/multip contains the source code of the Multiple-instance Learning Paraphrase (MultiP) model described in the following paper:

  author =  {Wei Xu and Alan Ritter and Chris Callison-Burch and William B. Dolan and Yangfeng Ji},
  title =   {Extracting Lexically Divergent Paraphrases from {Twitter}},
  journal = {Transactions of the Association for Computational Linguistics (TACL)},
  volume =  {2},
  number =  {1},
  year =    {2014},
  url =     {http://www.cis.upenn.edu/~xwe/files/tacl2014-extracting-paraphrases-from-twitter.pdf}