/robust-nli

Repository for code from "On Adversarial Removal of Hypothesis-only Bias in Natural Language Inference" (StarSem 2019) and "Don’t Take the Premise for Granted: Mitigating Artifacts in Natural Language Inference" (ACL 2019)

Robust NLI Using Adversarial Learning

This repository trains NLU models to robustly ignore the annotation artifacts that allow hypothesis-only models to outperform majority baselines. The goal is to train NLI models on datasets that contain annotation artifacts and still perform well on other datasets that do not contain those artifacts.

Requirements

All code in the repo relies on Python 2.7 and Anaconda2.

To create a conda environment with all required packages, run conda env create -f environment.yml

This project relies on PyTorch and is based on InferSent.

Data

We provide a bash script that downloads all data used in our experiments. The script also cleans and processes the data. To get and process the data, go to the data directory and run ./get_data.sh.

Training

To train a hypothesis-only NLI model, use src/train.py.

All command line arguments are initialized with default values. If you ran get_data.sh as described above, the default paths are already set correctly and you can just run src/train.py.

The most useful command line arguments are:

  • embdfile - File containing the word embeddings
  • outputdir - Output directory to store the model after training
  • train_lbls_file - NLI train data labels file
  • train_src_file - NLI train data source file
  • val_lbls_file - NLI validation (dev) data labels file
  • val_src_file - NLI validation (dev) data source file
  • test_lbls_file - NLI test data labels file
  • test_src_file - NLI test data source file
  • remove_dup - 1 to remove duplicate hypotheses from train, 0 to keep them in. 0 is the default
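
As a rough illustration of how these arguments fit together, the sketch below assembles a training command in Python. It assumes each argument is passed as a --name value flag, which the src/train.py --help usage suggests but which is not confirmed here. The helper build_train_command and the example output path are hypothetical and not part of the repo.

```python
import sys


def build_train_command(script="src/train.py", **overrides):
    """Return an argv list for the training script with the given overrides.

    Assumption: every argument is passed on the command line as `--name value`.
    Arguments that are not overridden keep the script's own defaults.
    """
    cmd = [sys.executable, script]
    for name, value in sorted(overrides.items()):
        cmd.extend(["--" + name, str(value)])
    return cmd


# Hypothetical output directory; remove_dup=1 drops duplicate hypotheses.
cmd = build_train_command(outputdir="models/hyp_only", remove_dup=1)
print(" ".join(cmd[1:]))
```

The resulting list can be handed to subprocess.call(cmd) from the repository root. Building the command as a list rather than a single string avoids shell quoting problems with paths that contain spaces.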

Adversarial Learning Hyper-parameters

  • adv_lambda - Controls the loss weight of the hypothesis-only classifier
  • adv_hyp_encoder_lambda - Controls the adversarial weight for the hypothesis-only encoder
  • nli_net_adv_hyp_encoder_lambda - Controls the adversarial weight for the hypothesis encoder in NLI net
  • random_premise_frac - Controls the fraction of random premises to use in NLI net

Mapping to hyper-parameters in the papers

In Don't Take the Premise for Granted (ACL):

  • alpha refers to adv_hyp_encoder_lambda (Method 1) and random_premise_frac (Method 2)
  • beta refers to adv_lambda (Method 1) and nli_net_adv_hyp_encoder_lambda (Method 2)

In On Adversarial Removal of Hypothesis-only Bias (StarSem):

  • λLoss refers to adv_lambda
  • λEnc refers to adv_hyp_encoder_lambda
  • λRand refers to random_premise_frac
  • λRandAdv refers to nli_net_adv_hyp_encoder_lambda
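
The two mappings above can be written down as lookup tables, which helps when translating a setting quoted from either paper into command line arguments. This is only a sketch: the dictionary names, the keys such as "method1" and "lambda_Loss", and the helper acl_to_args are made up for illustration, and the numeric values in the example are placeholders.

```python
# Paper symbol -> src/train.py argument, transcribed from the lists above.
# Keys like "method1" and "lambda_Loss" are illustrative spellings only.
ACL_SYMBOLS = {
    "method1": {"alpha": "adv_hyp_encoder_lambda",
                "beta": "adv_lambda"},
    "method2": {"alpha": "random_premise_frac",
                "beta": "nli_net_adv_hyp_encoder_lambda"},
}

STARSEM_SYMBOLS = {
    "lambda_Loss": "adv_lambda",
    "lambda_Enc": "adv_hyp_encoder_lambda",
    "lambda_Rand": "random_premise_frac",
    "lambda_RandAdv": "nli_net_adv_hyp_encoder_lambda",
}


def acl_to_args(method, alpha, beta):
    """Translate (alpha, beta) from the ACL paper into argument names."""
    names = ACL_SYMBOLS[method]
    return {names["alpha"]: alpha, names["beta"]: beta}


# Placeholder values, not taken from either paper.
print(acl_to_args("method1", alpha=0.5, beta=0.5))
print(acl_to_args("method2", alpha=0.5, beta=0.5))
```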

To see a description of more command line arguments, run src/train.py --help.

Hyper-parameters for transfer experiments

These are the hyper-parameter values for the transfer experiments reported in Table 2 of our ACL paper:

test set          adv_lambda  adv_hyp_encoder_lambda  random_premise_frac  nli_net_adv_hyp_encoder_lambda
SNLI test         0.1         0.2                     0.05                 0.05
SNLI hard         0.1         0.2                     0.05                 0.05
GLUE              1           0.05                    0.1                  0.05
MNLI mismatched   1           0.05                    0.1                  0.05
MNLI matched      0.4         0.1                     0.1                  0.05
JOCI test         0.8         0.05                    0.05                 0.05
MPE test          0.1         1                       0.05                 0.2
SICK test         0.1         1                       0.1                  0.05
ADD-ONE-RTE test  0.8         0.4                     0.8                  1
SCITAIL test      0.05        0.8                     0.1                  0.1
DPR test          1           0.2                     0.05                 0.4
SPRL test         1           1                       1                    1
FNPLUS test       0.8         1                       0.2                  0.2
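
For scripting, the table above can be kept as a small lookup structure. The sketch below transcribes the values as listed; the names COLUMNS, TRANSFER_HPARAMS, and hparams_for are hypothetical. It does not say which of the four values belong together in a single training run, since that depends on the method being reproduced.

```python
# Column order matches the table above.
COLUMNS = (
    "adv_lambda",
    "adv_hyp_encoder_lambda",
    "random_premise_frac",
    "nli_net_adv_hyp_encoder_lambda",
)

# Values transcribed row by row from the table above.
TRANSFER_HPARAMS = {
    "SNLI test":        (0.1, 0.2, 0.05, 0.05),
    "SNLI hard":        (0.1, 0.2, 0.05, 0.05),
    "GLUE":             (1, 0.05, 0.1, 0.05),
    "MNLI mismatched":  (1, 0.05, 0.1, 0.05),
    "MNLI matched":     (0.4, 0.1, 0.1, 0.05),
    "JOCI test":        (0.8, 0.05, 0.05, 0.05),
    "MPE test":         (0.1, 1, 0.05, 0.2),
    "SICK test":        (0.1, 1, 0.1, 0.05),
    "ADD-ONE-RTE test": (0.8, 0.4, 0.8, 1),
    "SCITAIL test":     (0.05, 0.8, 0.1, 0.1),
    "DPR test":         (1, 0.2, 0.05, 0.4),
    "SPRL test":        (1, 1, 1, 1),
    "FNPLUS test":      (0.8, 1, 0.2, 0.2),
}


def hparams_for(test_set):
    """Return {argument name: value} for one row of the table."""
    return dict(zip(COLUMNS, TRANSFER_HPARAMS[test_set]))


print(hparams_for("SCITAIL test"))
```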

Bibliography

If you use this repo, please cite the following two papers:

@inproceedings{belinkov-etal-2019-dont,
    title = "Don{'}t Take the Premise for Granted: Mitigating Artifacts in Natural Language Inference",
    author = "Belinkov, Yonatan  and Poliak, Adam  and Shieber, Stuart  and Van Durme, Benjamin  and Rush, Alexander",
    booktitle = "Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics",
    address = "Florence, Italy",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/P19-1084"
}

@inproceedings{belinkov-etal-2019-adversarial,
    title = "On Adversarial Removal of Hypothesis-only Bias in Natural Language Inference",
    author = "Belinkov, Yonatan  and Poliak, Adam  and Shieber, Stuart  and Van Durme, Benjamin  and Rush, Alexander",
    booktitle = "Proceedings of the Eighth Joint Conference on Lexical and Computational Semantics (*{SEM} 2019)",
    address = "Minneapolis, Minnesota",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/S19-1028"
}