Definition of `discrete_expl_th_token_ids` if `removal_args["remove_tokens"] == False`
Closed · 1 comment
phiwi commented
- ferret version: 0.4.1
- Python version: 3.9
- Operating System: Linux
Description
When you define sample[id_top] = self.tokenizer.mask_token_id
(line 229) in ferret/evaluators/faithfulness_measures.py, shouldn't the non-id_top
tokens be masked out instead (since we're computing sufficiency at this point), so that the code should be changed to
sample[~id_top] = self.tokenizer.mask_token_id # adding the tilde to negate the selection
?
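One caveat on the tilde: the thread does not say whether id_top is a boolean mask or an array of integer positions, and in NumPy `~` behaves differently for the two. The sketch below is only an illustration under that uncertainty; the token ids, the mask id 103, and all variable names are made up for the example and are not taken from ferret.

```python
import numpy as np

MASK_ID = 103  # hypothetical mask token id, for illustration only
input_ids = np.array([101, 7592, 2088, 2003, 2307, 102])

# Case 1: boolean mask. `~` flips it, so every non-top position is masked.
top_mask = np.array([False, True, True, False, False, False])
sample_bool = input_ids.copy()
sample_bool[~top_mask] = MASK_ID

# Case 2: integer indices. `~` is bitwise NOT, so [1, 2] becomes [-2, -3],
# which selects positions 4 and 3 rather than the complement of [1, 2].
top_idx = np.array([1, 2])
sample_wrong = input_ids.copy()
sample_wrong[~top_idx] = MASK_ID

# Converting the indices to a boolean mask first gives the intended complement.
keep = np.zeros(input_ids.shape[0], dtype=bool)
keep[top_idx] = True
sample_right = input_ids.copy()
sample_right[~keep] = MASK_ID
```

If id_top holds integer positions, the boolean-mask conversion in the last step is probably the safer way to express "everything except id_top".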
elianap commented
Thank you for noticing it!
Yes. When the mask token is used for removal instead of deleting the word (i.e., when removal_args["remove_tokens"] == False), sufficiency should mask the tokens that are not in id_top, so that only the most important tokens are preserved.
We fixed it in #28!
It will also be available in the next release of ferret.
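For readers following along, the behavior described above can be sketched roughly as follows. This is not the code from #28; `mask_tokens` and its `keep_top` flag are hypothetical names, and id_top is assumed here to be a list of integer positions.

```python
import numpy as np

def mask_tokens(input_ids, id_top, mask_token_id, keep_top=True):
    """Replace tokens with the mask id (hypothetical helper, not ferret's API).

    keep_top=True  -> mask everything except id_top (the sufficiency case).
    keep_top=False -> mask id_top itself (what the original line 229 did).
    """
    input_ids = np.asarray(input_ids)
    # Boolean vector marking which positions belong to id_top.
    is_top = np.isin(np.arange(input_ids.shape[0]), id_top)
    to_mask = ~is_top if keep_top else is_top
    sample = input_ids.copy()  # leave the caller's array untouched
    sample[to_mask] = mask_token_id
    return sample

ids = [101, 7592, 2088, 2003, 102]
kept_top = mask_tokens(ids, [1, 3], mask_token_id=103, keep_top=True)
masked_top = mask_tokens(ids, [1, 3], mask_token_id=103, keep_top=False)
```

With keep_top=True only positions 1 and 3 survive, which matches the idea of preserving just the most important tokens.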