NLI_datasets: Datasets on inference

Dataset format:

premise | hypothesis | label
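A minimal sketch of reading one example in this format. It assumes each example is a single text line with the three fields separated by "|", which this listing does not confirm; the downloaded files may use a different delimiter. The names `NLIExample` and `parse_line` are hypothetical.

```python
from typing import NamedTuple


class NLIExample(NamedTuple):
    premise: str
    hypothesis: str
    label: str


def parse_line(line: str, sep: str = "|") -> NLIExample:
    """Split one 'premise | hypothesis | label' line into its three fields.

    Assumption: the separator never occurs inside the premise or hypothesis.
    """
    parts = [field.strip() for field in line.split(sep)]
    if len(parts) != 3:
        raise ValueError(f"expected 3 fields, got {len(parts)}: {line!r}")
    return NLIExample(*parts)


example = parse_line(
    "My body cast a shadow over the grass | The sun was rising | entailment"
)
```

If the files turn out to be tab-separated, passing `sep="\t"` should be enough.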

  • year: 1996

  • type: NLI dataset

  • adapted from https://nlp.stanford.edu/~wcmac/downloads/fracas.xml

    We took P1, ..., Pn as the premise and H as the hypothesis, and mapped the labels with {'yes': 'entailment', 'no': 'contradiction', 'undef': 'neutral', 'unknown': 'neutral'}.

    We then randomly split the data 80/20 into train/dev.

  • link: https://drive.google.com/open?id=13-gDw3lnxqnqYwXQUdH0bPsPPZXdD5n7
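The adaptation described above could be sketched roughly as follows. The label mapping is the one given in the entry; joining P1..Pn with a single space, the seed value, and the function names `convert_problem` and `split_train_dev` are assumptions made for illustration, not the procedure actually used to build the files.

```python
import random

# Mapping from FraCaS answers to NLI labels, as listed in the entry above.
FRACAS_TO_NLI = {
    "yes": "entailment",
    "no": "contradiction",
    "undef": "neutral",
    "unknown": "neutral",
}


def convert_problem(premises, hypothesis, answer):
    """Merge P1..Pn into one premise and map the FraCaS answer to an NLI label.

    Assumption: premises are joined with a single space.
    """
    return {
        "premise": " ".join(p.strip() for p in premises),
        "hypothesis": hypothesis.strip(),
        "label": FRACAS_TO_NLI[answer],
    }


def split_train_dev(examples, train_fraction=0.8, seed=0):
    """Shuffle a copy of the examples and split it 80/20 by default.

    The fixed seed is a hypothetical choice to make the split repeatable.
    """
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Toy input for illustration only; not taken from the dataset.
toy = convert_problem(["A dog barked.", "The dog was loud."], "A dog made noise.", "yes")
train, dev = split_train_dev(range(10))
```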


  • year: 2006, 2007, 2009 -> (created from RTE1, RTE2, RTE3, RTE5)
  • type: RTE dataset
  • link: https://drive.google.com/open?id=1-7xH81M__XsKF7Uog5nemoNppabiKqZy


  • year: 21 March 2011 (constructed from http://people.ict.usc.edu/~gordon/copa.html)

  • type: RTE dataset

  • note: I have modified the original dataset as follows.

    original:

      premise                              | choice1            | choice2           | label
      My body cast a shadow over the grass | The sun was rising | The grass was cut | 0

    modified:

      premise                              | hypothesis         | label
      My body cast a shadow over the grass | The sun was rising | entailment
      My body cast a shadow over the grass | The grass was cut  | not_entailment
  • link: https://drive.google.com/open?id=1UYXo9OQOnu51yjunHGBwSa6Jj_YFoCLR
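The conversion in the note above can be sketched as a small function. It assumes the original label is the 0-based index of the correct choice, which is what the example row suggests; `copa_to_rte` is a hypothetical name, not part of any released script.

```python
def copa_to_rte(premise, choice1, choice2, label):
    """Expand one COPA item into two premise/hypothesis/label rows.

    Assumption: label is the 0-based index of the correct choice
    (0 -> choice1, 1 -> choice2). The correct choice becomes
    'entailment' and the other one 'not_entailment'.
    """
    rows = []
    for index, choice in enumerate((choice1, choice2)):
        new_label = "entailment" if index == label else "not_entailment"
        rows.append((premise, choice, new_label))
    return rows


rows = copa_to_rte(
    "My body cast a shadow over the grass",
    "The sun was rising",
    "The grass was cut",
    0,
)
```

Each original item yields exactly two rows, so the converted file has twice as many examples as the source.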


  • year: January 2012 -> (created from The Winograd Schema Challenge)
  • type: RTE dataset
  • link: https://drive.google.com/open?id=1R1aR18dC4ke1TUSzDl8Pwr4VviDgpX_q



  • year: 2015
  • type: NLI dataset
  • link: https://drive.google.com/open?id=1k1Mj0-vVdGuPkx3BaP_BGCUyRb-97Mx2

Add-one RTE


  • year: 11 October 2016 -> (created from The Stanford Question Answering Dataset)
  • type: RTE dataset
  • link: https://drive.google.com/open?id=1_5BK3fWBzQouS8XtLGx57TFQhRlAjRTN


  • year: 18 Apr 2017
  • type: NLI dataset
  • link: https://drive.google.com/open?id=1zW3D9E6uVcKRvAEsS_inI31J4QO1J3po



  • year: 27 Nov 2017
  • type: RTE dataset
  • collected from: http://decomp.io/projects/diverse-natural-language-inference/
  • link: https://drive.google.com/open?id=1vYE0lAMf_G0iLKCV9oE31xxnxQPZbLTv


  • year: 27 Nov 2017
  • type: NLI dataset
  • link: https://drive.google.com/open?id=1oH5YpPg1gkddrrV6TRA5tA7zxnVkjqiX


  • year: 27 Apr 2018
  • type: RTE dataset
  • alteration: I have changed the labels: {"neutral": "not_entailment", "entailment": "entailment"}
  • link: https://drive.google.com/open?id=11VMy9l6RNBXfvgD9_GfMHHj8e95gokel
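The label alteration above amounts to a dictionary lookup per row. A minimal sketch, assuming rows are (premise, hypothesis, label) tuples; `relabel` is a hypothetical helper name.

```python
# Mapping given in the alteration note above.
LABEL_MAP = {"neutral": "not_entailment", "entailment": "entailment"}


def relabel(rows):
    """Return new rows with labels rewritten through LABEL_MAP.

    A label outside the mapping raises KeyError rather than passing
    through silently, so unexpected labels are noticed early.
    """
    return [(premise, hypothesis, LABEL_MAP[label]) for premise, hypothesis, label in rows]


# Toy rows for illustration only; not taken from the dataset.
converted = relabel([("p1", "h1", "neutral"), ("p2", "h2", "entailment")])
```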

Commitment Bank