Given a premise-hypothesis pair, NLI is the task of deciding whether the premise entails the hypothesis, contradicts it, or neither (the relationship is neutral).
Cross-lingual NLI involves training a natural language inference model in one language and predicting entailment labels for data in another. For example, in this project we train an NLI model on MultiNLI data, which is available only in English, and evaluate it on other languages.
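For illustration, here are three made-up premise-hypothesis pairs covering the label space (invented examples, not actual MultiNLI data):

```python
# (premise, hypothesis, label) triples -- invented for illustration
examples = [
    ("A man is playing a guitar on stage.", "A man is performing music.", "entailment"),
    ("A man is playing a guitar on stage.", "The man is sleeping.",       "contradiction"),
    ("A man is playing a guitar on stage.", "The performer is famous.",   "neutral"),
]
```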
Holds the notebook that explains how a target-language encoder is aligned to a source-language encoder using parallel corpora, and how the aligned encoder is used to perform cross-lingual NLI without translation. An explanation of the whole process is also given below.
In this folder you can find the notebook and .py files with directions for training an SNLI model (English-only).
Here you can find the notebook and .py files with directions for training a MultiNLI model (English-only).
The Translate-Train method uses machine translation to generate training sets in the XNLI dev and test languages. Here we use the machine-translated training sets provided in the XNLI repo to reproduce the translate-train results of Conneau et al. (2018). We use this method as our primary baseline.
The Translate-Test method involves translating the development and test sets into the training/source language (English). Machine-translated dev and test sets are also provided in the XNLI repo. We reproduce the translate-test results to use as our secondary baseline; a small sketch of both baselines follows.
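To make the two baselines concrete, here is a tiny sketch. The `translate` function and the model callables are hypothetical stand-ins; in practice the XNLI repo already ships the translated files, so no MT system is actually called in this project.

```python
# Sketch of the two translation baselines; translate() is a hypothetical
# stand-in for an MT system (XNLI provides the translated files instead).

def translate(text, src, tgt):
    raise NotImplementedError("stand-in: plug in an MT system here")

def translate_test_predict(en_model, premise, hypothesis):
    """Translate-Test: translate a target-language pair into English,
    then classify with the English-only model."""
    return en_model(translate(premise, src="de", tgt="en"),
                    translate(hypothesis, src="de", tgt="en"))

def translate_train_predict(de_model, premise, hypothesis):
    """Translate-Train: de_model was trained on machine-translated
    MultiNLI data, so target-language inputs are classified directly."""
    return de_model(premise, hypothesis)
```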
How does it work? Say our goal is to perform NLI in German without translating the training set into German (or the test set into English). Each experiment consists of the following three steps:
1) Training on English NLI Data: We first train an LSTM encoder and a linear classifier on MultiNLI data. Then we make a copy of the encoder, so that we have two identical encoders: one for the source language (En) and one for the target language (De).
2) Aligning Encoders: Then, using parallel sentence pairs in English and German (from the Europarl or OpenSubtitles 2018 corpora), we align the German encoder to the English encoder so that the two produce nearby sentence representations in the embedding space. In addition to the alignment loss proposed by Conneau et al. (2018), we use an adversarial objective: while aligning, we simultaneously try to fool a discriminator trained to tell English embeddings from German ones. We incorporate adversarial training because our goal is to produce embeddings so close in the space that a linear classifier trained for NLI cannot tell the difference between English and German embeddings. This way, we can perform cross-lingual NLI without translation; a sketch of the full pipeline follows the language list below.
3) Inference on XNLI (Non-English NLI Data): We build cross-lingual NLI models by training on MultiNLI and aligning encoders for the following languages:
en: English, ar: Arabic, bg: Bulgarian, de: German, el: Greek, es: Spanish, fr: French, hi: Hindi, ru: Russian, th: Thai, tr: Turkish, vi: Vietnamese, & zh: Chinese.
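To make the three steps concrete, here is a minimal PyTorch sketch. The module names, dimensions, and the `lambda_adv` weight are assumptions, and the plain L2 alignment term stands in for the Conneau et al. (2018) loss; the repo's .py files contain the actual implementation.

```python
import copy

import torch
import torch.nn as nn
import torch.nn.functional as F

class NLIEncoder(nn.Module):
    """Illustrative BiLSTM sentence encoder with max-pooling."""
    def __init__(self, emb_dim=300, hid_dim=512):
        super().__init__()
        self.lstm = nn.LSTM(emb_dim, hid_dim, batch_first=True,
                            bidirectional=True)

    def forward(self, embedded):              # (batch, seq_len, emb_dim)
        out, _ = self.lstm(embedded)          # (batch, seq_len, 2 * hid_dim)
        return out.max(dim=1).values          # max-pooled sentence vector

# Step 1: train src_encoder plus a linear 3-way classifier on MultiNLI
# (training loop omitted), then copy the encoder for the target language.
src_encoder = NLIEncoder()
tgt_encoder = copy.deepcopy(src_encoder)      # identical German encoder

# Step 2: align tgt_encoder to the frozen src_encoder on parallel data
# while trying to fool a discriminator that separates En from De vectors.
discriminator = nn.Sequential(
    nn.Linear(1024, 256), nn.ReLU(), nn.Linear(256, 1))

def encoder_alignment_loss(en_embedded, de_embedded, lambda_adv=0.25):
    """Alignment + adversarial loss for one parallel (En, De) batch."""
    with torch.no_grad():
        en_vec = src_encoder(en_embedded)     # source side stays frozen
    de_vec = tgt_encoder(de_embedded)
    align = F.mse_loss(de_vec, en_vec)        # simplified alignment term
    # Adversarial term: push De vectors to be classified as "English" (1).
    adv = F.binary_cross_entropy_with_logits(
        discriminator(de_vec), torch.ones(de_vec.size(0), 1))
    return align + lambda_adv * adv

# (In alternating steps, the discriminator itself is trained to separate
# real En vectors from De vectors; that update is omitted here.)

# Step 3: at test time, encode XNLI premise/hypothesis pairs with
# tgt_encoder and feed the features to the English-trained classifier.
```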
Download the aligned fastText vectors for each language, replacing <lang> with a code from the list above:

curl -o wiki.<lang>.align.vec https://dl.fbaipublicfiles.com/fasttext/vectors-aligned/wiki.<lang>.align.vec
Alternatively, you can use multilingual fastText vectors.
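To sanity-check a downloaded file, a minimal loader for the standard fastText text format (a header line with the word count and vector dimension, then one word and its values per line) might look like this; the path and word cap are placeholders:

```python
import numpy as np

def load_vec(path, max_words=100_000):
    """Load a fastText .vec file into a {word: vector} dict."""
    vectors = {}
    with open(path, encoding="utf-8") as f:
        n_words, dim = map(int, f.readline().split())  # header: "count dim"
        for i, line in enumerate(f):
            if i >= max_words:
                break
            word, *values = line.rstrip().split(" ")
            vectors[word] = np.asarray(values, dtype=np.float32)
    return vectors

embeddings = load_vec("wiki.de.align.vec")  # e.g. the German vectors
```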
pytorch
nltk: for the standard English tokenizer
jieba: for the Chinese tokenizer
spacy
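These can typically be installed with pip (package names assumed; if the repo ships a requirements file, prefer that). nltk's punkt tokenizer data may also need to be downloaded once:

```bash
pip install torch nltk jieba spacy
python -c "import nltk; nltk.download('punkt')"
```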