A Multilingual Perspective Towards the Evaluation of Attribution Methods in Natural Language Inference

In this work, we present a multilingual approach for evaluating attribution methods for the Natural Language Inference (NLI) task in terms of plausibility and faithfulness. First, we introduce a novel cross-lingual strategy to measure faithfulness based on word alignments, which eliminates the potential downsides of erasure-based evaluations. We then perform a comprehensive evaluation of attribution methods, considering different output mechanisms and aggregation methods. Finally, we augment the XNLI dataset with highlight-based explanations, providing a multilingual NLI dataset with highlights, which may support future exNLP studies. Our results show that the attribution methods that perform best for plausibility differ from those that perform best for faithfulness.

Experiments

All experiments in the paper can be reproduced with the notebooks we provide.

e-XNLI dataset

The dataset can be found under the data folder. Validation and training splits will be released soon.
