locuslab/massive-activations

Code accompanying the paper "Massive Activations in Large Language Models"

Python · MIT

Issues

the standard deviation of the activation
#7 opened 3 months ago by Cooperx521
3
Thoughts about ViT register and anchor token
#6 opened 3 months ago by Cooperx521
3
Training only on 2B tokens (openwebtext)
#5 opened 6 months ago by Nandan91
3
what if set all layers' massive activations to their mean value?
#4 opened 7 months ago by yyfcc17
1
How to get the mean value of massive activation
#3 opened 7 months ago by pengyao96
1
Which layer's activation is used?
#1 opened 7 months ago by iyupan
1