Threshold for Attention Vector during comparison
jaydebsarker opened this issue · 2 comments
I am quoting two lines here from: https://github.com/hila-chefer/Transformer-Explainability/blob/main/BERT_explainability.ipynb

i) `expl = explanations.generate_LRP(input_ids=input_ids, attention_mask=attention_mask, start_layer=0)[0]`

ii) `expl = (expl - expl.min()) / (expl.max() - expl.min())` (the notebook comment above this line reads "normalize scores")
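To make the effect of line ii) concrete, here is a minimal NumPy sketch; the score values are made up for illustration:

```python
import numpy as np

# made-up relevance scores for six tokens (illustration only)
expl = np.array([0.3, 1.2, -0.5, 0.0, 2.0, 0.8])

# min-max normalization, as in line ii): maps all scores into [0, 1]
expl = (expl - expl.min()) / (expl.max() - expl.min())

print(expl.min(), expl.max())  # 0.0 1.0
```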
You normalize the explanation scores in line ii). After normalization, the most significant token gets a score of 1.0, while other tokens may get, say, 0.7 or 0.6. In that case, which tokens are considered responsible for the predicted class (i.e., negative sentiment) in the model output? Specifically, did you apply a threshold (e.g., >= 0.5) to each token's score to assign it to a class?
I was wondering if you would clarify this for me.
Hi @jaydebsarker, thanks for your interest!
I apologize for the delay in my response.
The normalization sets the values between 0 and 1 for visualization purposes.
I do not set a threshold, but if you wish to set one, I recommend using Otsu’s method for that (as done in our second paper).
Best,
Hila.
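For readers who want to follow the Otsu suggestion, here is a hedged, self-contained NumPy sketch; this is not the authors' code, and both `otsu_threshold` and the sample scores are illustrative:

```python
import numpy as np

def otsu_threshold(scores, bins=256):
    """Otsu's method: choose the threshold that maximizes the
    between-class variance of the two groups it induces.
    Assumes scores are already normalized to [0, 1]."""
    hist, edges = np.histogram(scores, bins=bins, range=(0.0, 1.0))
    probs = hist.astype(float) / hist.sum()
    centers = (edges[:-1] + edges[1:]) / 2.0
    best_t, best_var = 0.0, -1.0
    for i in range(1, bins):
        w0, w1 = probs[:i].sum(), probs[i:].sum()
        if w0 == 0.0 or w1 == 0.0:
            continue  # one class would be empty at this split
        mu0 = (probs[:i] * centers[:i]).sum() / w0
        mu1 = (probs[i:] * centers[i:]).sum() / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, edges[i]  # class boundary
    return best_t

# illustrative normalized scores: a few highly relevant tokens, many weak ones
expl = np.array([1.0, 0.85, 0.9, 0.1, 0.05, 0.15, 0.0, 0.12])
t = otsu_threshold(expl)
relevant = expl >= t  # tokens attributed to the predicted class
```

If scikit-image is available, `skimage.filters.threshold_otsu` implements the same idea and can replace the hand-rolled function above.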