Does 'base' refer to randomly ablating 10 heads, or to the original OpenCLIP?
Yang-bug-star opened this issue · 1 comment
Yang-bug-star commented
yossigandelsman commented
base is the performance of the original model. top-random randomly ablates the same number of heads as "ours", repeats this experiment a few times, and takes the maximum accuracy.
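For anyone reading later, here is a minimal sketch of how the top-random number could be computed, under stated assumptions. The names `top_random_baseline`, `toy_evaluate`, and `IMPORTANT_HEADS`, the 4x4 head grid, the trial count, and the toy accuracy function are all hypothetical and only illustrate the procedure; this is not the repository's actual evaluation code.

```python
import random
from typing import Callable, List, Sequence, Tuple

Head = Tuple[int, int]  # (layer index, head index); a hypothetical representation


def top_random_baseline(
    all_heads: Sequence[Head],
    num_ablated: int,
    evaluate: Callable[[List[Head]], float],
    num_trials: int = 5,
    seed: int = 0,
) -> float:
    """Ablate `num_ablated` random heads, repeat `num_trials` times,
    and return the maximum accuracy seen across the trials."""
    rng = random.Random(seed)
    best = float("-inf")
    for _ in range(num_trials):
        # Sample without replacement so each trial ablates distinct heads.
        ablated = rng.sample(list(all_heads), num_ablated)
        best = max(best, evaluate(ablated))
    return best


# Toy stand-in for a real evaluation (assumption: a real version would run
# the model with the given heads ablated and return its accuracy).
IMPORTANT_HEADS = {(3, 0), (3, 1)}


def toy_evaluate(ablated: List[Head]) -> float:
    # Accuracy drops by 0.1 for each "important" head that is ablated.
    return 0.9 - 0.1 * len(set(ablated) & IMPORTANT_HEADS)


all_heads = [(layer, head) for layer in range(4) for head in range(4)]
base_accuracy = toy_evaluate([])  # "base": nothing ablated
top_random_accuracy = top_random_baseline(all_heads, 2, toy_evaluate)
```

In this sketch, `base_accuracy` plays the role of base and `top_random_accuracy` the role of top-random, with `num_ablated` set to however many heads "ours" ablates.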