
Examine two sentences and determine whether they have the same meaning.

Primary language: Rich Text Format. License: MIT.

Paraphrase-Identification-Task

Paraphrase detection is the task of examining two text entities (e.g., sentences) and determining whether they have the same meaning. High accuracy on this task requires thorough syntactic and semantic analysis of both text entities.

What is a Paraphrase?

In simple terms, a paraphrase is an alternative expression of the same meaning.

Classification of Paraphrases

According to granularity, paraphrases are of two types.

  • Surface Paraphrases
    • Lexical level
      • Example - solve and resolve
    • Phrase level
      • Example - look after and take care of
    • Sentence level
      • Example - The table was set up in the carriage shed and The table was laid under the cart-shed
    • Discourse level
  • Structural paraphrases
    • Pattern level
      • Example - [X] considers [Y] and [X] takes [Y] into consideration
    • Collocation level
      • Example - (turn on, OBJ light) and (switch on, OBJ light)
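
The pattern-level example above can be illustrated with a small sketch. It assumes two hand-written regular-expression templates for the [X] and [Y] slots; the names PATTERNS, extract_slots, and same_pattern_paraphrase are hypothetical and are not taken from this repository.

```python
import re

# Hypothetical templates for the pattern-level example; X and Y are the slots.
PATTERNS = [
    re.compile(r"^(?P<X>.+?) considers (?P<Y>.+)$"),
    re.compile(r"^(?P<X>.+?) takes (?P<Y>.+) into consideration$"),
]


def extract_slots(sentence):
    """Return the (X, Y) slot fillers if a template matches, else None."""
    text = sentence.strip().rstrip(".")
    for pattern in PATTERNS:
        match = pattern.match(text)
        if match:
            return match.group("X").lower(), match.group("Y").lower()
    return None


def same_pattern_paraphrase(a, b):
    """True when both sentences match a template with identical slot fillers."""
    slots_a, slots_b = extract_slots(a), extract_slots(b)
    return slots_a is not None and slots_a == slots_b


print(same_pattern_paraphrase(
    "The committee considers the proposal.",
    "The committee takes the proposal into consideration.",
))
```

Real pattern-level paraphrase resources are typically learned from corpora rather than written by hand; the two templates here only mirror the example in the list.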

According to paraphrase style, they can be classified into five types.

  • Trivial Change
    • Example - all the members of and all members of
  • Phrase replacement
    • Example - There will be major cuts in the salaries of high-level civil servants and There will be major cuts in the salaries of senior officials
  • Phrase reordering
    • Example - Last night, I saw Tom in the shopping mall and I saw Tom in the shopping mall last night
  • Sentence split & merge
    • Example - He bought a computer which is very expensive and (1) He bought a computer. (2) The computer is very expensive.
  • Complex paraphrase
    • Example - He said there will be major cuts in the salaries of high-level civil servants and He claimed to implement huge salary cut to senior civil servants

Applications of Paraphrase Identification

  • Machine Translation
    • Simplify input sentences
    • Alleviate data sparseness
  • Question Answering
    • Question reformulation
  • Information Extraction
    • IE pattern expansion
  • Information Retrieval
    • Query reformulation
  • Summarization
    • Sentence clustering
    • Automatic evaluation
  • Natural Language Generation
    • Sentence rewriting
  • Others
    • Changing writing style
    • Text simplification
    • Identifying plagiarism

Related Research Topics

  • Textual Entailment
  • Semantic Textual Similarity

Research on Paraphrasing

  • Paraphrase identification
  • Paraphrase extraction
  • Paraphrase generation
  • Paraphrase applications

Paraphrase Identification

  • Specifically refers to sentential paraphrase identification
    • Given any pair of sentences, automatically identify whether the two sentences are paraphrases
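
As a rough illustration of the task definition above, the following sketch implements a naive lexical-overlap baseline: Jaccard similarity over lowercased word sets, compared with a threshold. This is not the method used in this repository; the function names and the 0.7 threshold are assumptions chosen only for illustration.

```python
import re


def tokens(sentence):
    """Lowercase the sentence and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def jaccard(a, b):
    """Jaccard similarity between the token sets of two sentences."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 1.0  # two empty sentences are treated as identical
    return len(ta & tb) / len(ta | tb)


def is_paraphrase(a, b, threshold=0.7):
    """Label a pair as a paraphrase when overlap reaches the (assumed) threshold."""
    return jaccard(a, b) >= threshold


reordered = (
    "Last night, I saw Tom in the shopping mall",
    "I saw Tom in the shopping mall last night",
)
replaced = (
    "There will be major cuts in the salaries of high-level civil servants",
    "There will be major cuts in the salaries of senior officials",
)
print(is_paraphrase(*reordered))  # word sets are identical
print(is_paraphrase(*replaced))   # overlap falls below the threshold
```

With this threshold, the phrase-reordering pair is accepted because the two word sets are identical, while the phrase-replacement pair is rejected even though the sentences mean the same thing. That gap is one reason surface overlap alone is not enough and deeper syntactic and semantic analysis is needed.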

Overview of Paraphrase Identification Methods

Further discussion of previous work is documented here.

Reference