yossigandelsman/clip_text_span

Does 'base' refer to randomly ablating 10 heads, or to the original OpenCLIP?

Yang-bug-star opened this issue · 1 comment

image
What does "base" refer to here: randomly ablating 10 heads, or the original OpenCLIP model? The original paper says the baseline is obtained by randomly ablating 10 heads.

"base" is the original model's performance. "top-random" randomly ablates the same number of heads as "ours", repeats this experiment a few times, and takes the maximum accuracy.
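To make the "top-random" procedure concrete, here is a minimal sketch of how it could be computed. This is not the repository's code: `top_random_baseline`, the `evaluate` callback, the `(layer, head)` indexing, and the default of 5 trials are all assumptions made for illustration.

```python
import random
from typing import Callable, List, Sequence, Tuple

Head = Tuple[int, int]  # hypothetical (layer index, head index) pair


def top_random_baseline(
    all_heads: Sequence[Head],
    num_ablated: int,
    evaluate: Callable[[List[Head]], float],
    num_trials: int = 5,  # assumed value; the thread only says "a few times"
    seed: int = 0,
) -> float:
    """Return the best accuracy over several random head ablations.

    `evaluate` is a hypothetical callback that ablates the given heads
    and returns the resulting accuracy.
    """
    rng = random.Random(seed)
    best = float("-inf")
    for _ in range(num_trials):
        # Sample as many distinct heads as the "ours" method ablates.
        ablated = rng.sample(list(all_heads), num_ablated)
        best = max(best, evaluate(ablated))
    return best


# Toy usage: 4 layers x 16 heads, with a stand-in scoring function.
heads = [(layer, head) for layer in range(4) for head in range(16)]


def toy_evaluate(ablated: List[Head]) -> float:
    # Stand-in for a real evaluation: accuracy drops with each ablated head.
    return 1.0 - 0.01 * len(ablated)


top_random_accuracy = top_random_baseline(heads, num_ablated=10, evaluate=toy_evaluate)
```

"base" would then be the accuracy with no heads ablated, and "ours" would call the same evaluation with the selected heads instead of a random sample.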