What if all layers' massive activations are set to their mean value?
yyfcc17 commented
interesting work!
I have a question, as in the title: did you conduct an experiment like that? What was the result?
thanks.
Eric-mingjie commented
Hi, thanks for your interest in our work.
We did not evaluate this setting. However, given that the values of massive activations remain constant across layers (Figure 4), this will likely not affect performance.
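For anyone who wants to try this, here is a rough sketch of the intervention being asked about. It is not code from the repository and it has not been run against a real model. It assumes the hidden states are already collected as nested lists indexed `[layer][token][feature]`, that the massive-activation positions are known as `(token, feature)` pairs, and that "mean value" means the mean across layers at each position (one possible reading of the question). The function name `set_massive_to_mean` and its arguments are hypothetical.

```python
from statistics import mean


def set_massive_to_mean(hidden_states, positions):
    """Return a copy of hidden_states with each listed position overwritten,
    in every layer, by that position's mean value across layers.

    hidden_states: list over layers of [token][feature] nested lists (assumed layout).
    positions: iterable of (token_idx, feature_idx) pairs (assumed to be known).
    """
    # Copy so the original activations stay untouched.
    patched = [[row[:] for row in layer] for layer in hidden_states]
    for tok, feat in positions:
        # Mean of this position over all layers (assumption: mean across layers).
        avg = mean(layer[tok][feat] for layer in hidden_states)
        for layer in patched:
            layer[tok][feat] = avg
    return patched


# Toy example: 3 layers, 2 tokens, 3 features, one massive activation at (0, 0).
toy = [
    [[1000.0, 0.1, 0.2], [0.3, 0.4, 0.5]],
    [[1010.0, 0.2, 0.1], [0.2, 0.3, 0.6]],
    [[990.0, 0.3, 0.3], [0.1, 0.5, 0.4]],
]
patched = set_massive_to_mean(toy, [(0, 0)])
print([layer[0][0] for layer in patched])
```

In a real model the overwrite would have to happen during the forward pass (for example in a hook on each layer's output) so that later layers see the modified values. The sketch only shows the replacement rule itself.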