inseq-team/inseq

[Summary] Add perturbation feature attribution methods

gsarti opened this issue · 2 comments

🚀 Feature Request

The following is a non-exhaustive list of perturbation-based feature attribution methods that could be added to the library:

| Method name | Source | In Captum | Code implementation | Status |
| --- | --- | --- | --- | --- |
| (Layer) Feature Ablation ¹ | - | ✅ | pytorch/captum | |
| Occlusion | Zeiler and Fergus '13 | ✅ | pytorch/captum | ✅ |
| Shapley Value Sampling | Castro et al. '09 | ✅ | pytorch/captum | |
| Lime | Ribeiro et al. '16 | ✅ | pytorch/captum | ✅ |
| KernelShap | Lundberg and Lee '17 | ✅ | pytorch/captum | |
| Editing ² | - | - | - | |
| Greedy Rationalization ³ | Vafa et al. '21 | - | keyonvafa/sequential-rationales | |
| Information Bottleneck | Jiang et al. '20 | - | DFKI-NLP/thermostat | |
| BayesLime | Slack et al. '21 | - | dylan-slack/Modeling-Uncertainty-Local-Explainability | |
| BayesSHAP | Slack et al. '21 | - | dylan-slack/Modeling-Uncertainty-Local-Explainability | |
| Input Reduction | Feng et al. '18 | - | - | |
| Input Marginalization | Kim et al. '20 | - | - | |
| Occlusion & Language Modeling | Harbecke and Alt '20 | - | DFKI-NLP/OLM | |
| Context Probing ⁴ | Cífka and Liutkus '22 | - | cifkao/context-probing | |
| Weighted SHAP | Kwon and Zou '22 | - | ykwon0407/WeightedSHAP | |
| Value Zeroing | Mohebbi et al. '23 | - | hmohebbi/ValueZeroing | ✅ |
| Comprehensiveness-as-a-metric | Zhou et al. '23 | - | YilunZhou/solvability-explainer | |
| Sufficiency-as-a-metric | Zhou et al. '23 | - | YilunZhou/solvability-explainer | |
| Causal Tracing | Meng et al. '22 | - | kmeng01/rome | |
| Attention Knockout ⁵ | Geva et al. '23 | - | - | |
| ReAGent | Zhao et al. '24 | - | casszhao/ReAGent | ✅ |
| SyntaxSHAP | Amara et al. '24 | - | k-amara/syntax-shap | |

Notes:

  1. For more information on Editing, see point 3 in #112.

Footnotes

  1. Called ablation, but performs masking of features using a baseline.
  2. Editing replaces tokens with their nearest neighbors in the vocabulary embedding space and measures saliency as the drop in performance for the target. In the future, this could allow users to specify a custom editing strategy via an input Callable.
  3. Possibly overlapping with feature ablation to some extent.
  4. Valid only for decoder-only models.
  5. Verify whether it would be exactly equivalent to Value Zeroing; include it only if functionally different (alias otherwise).
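
To make the Editing idea in footnote 2 more concrete, here is a minimal, self-contained sketch. It is not Inseq or Captum code: the embedding table `EMBEDDINGS`, the scorer `toy_score`, and the functions `nearest_neighbor` and `editing_attribution` are all hypothetical names invented for illustration, and the toy scorer merely stands in for a real model's target probability. The sketch replaces each token with its nearest neighbor under cosine similarity and reports the drop in score as saliency, with the editing strategy passed in as a Callable.

```python
import math
from typing import Callable, Dict, List, Sequence

# Hypothetical toy embedding table (assumption: not taken from any real model).
EMBEDDINGS: Dict[str, List[float]] = {
    "good": [1.0, 0.0],
    "great": [0.9, 0.1],
    "bad": [-1.0, 0.0],
    "awful": [-0.9, 0.1],
    "movie": [0.0, 1.0],
    "film": [0.1, 0.9],
}


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_neighbor(token: str) -> str:
    """Default editing strategy: closest *other* token in embedding space."""
    candidates = (t for t in EMBEDDINGS if t != token)
    return max(candidates, key=lambda t: cosine(EMBEDDINGS[token], EMBEDDINGS[t]))


def toy_score(tokens: Sequence[str]) -> float:
    """Hypothetical stand-in for a model's probability of the target."""
    weights = {"good": 0.5, "great": 0.3, "bad": -0.4, "awful": -0.5}
    logit = sum(weights.get(t, 0.0) for t in tokens)
    return 1.0 / (1.0 + math.exp(-logit))


def editing_attribution(
    tokens: Sequence[str],
    score_fn: Callable[[Sequence[str]], float],
    edit_fn: Callable[[str], str] = nearest_neighbor,
) -> List[float]:
    """Saliency of each token = score drop after editing that token alone."""
    base = score_fn(tokens)
    saliency = []
    for i, tok in enumerate(tokens):
        edited = list(tokens)
        edited[i] = edit_fn(tok)  # perturb one position at a time
        saliency.append(base - score_fn(edited))
    return saliency


if __name__ == "__main__":
    sentence = ["good", "movie"]
    print(editing_attribution(sentence, toy_score))
    # Custom strategy supplied as a Callable: replace with a fixed placeholder.
    print(editing_attribution(sentence, toy_score, edit_fn=lambda tok: "[MASK]"))
```

Swapping `edit_fn` for a function that returns a fixed baseline token, as in the second call, gives a masking-style perturbation closer to what footnote 1 describes. A real implementation would use the model's embedding matrix and output probabilities rather than these toy stand-ins.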

Added to method table!