Question about RotaryPEMultiHeadAttention: rope_percentage
YOONSEOKHEO opened this issue · 0 comments
YOONSEOKHEO commented
I see that the RotaryPEMultiHeadAttention class contains code that reduces the number of feature dimensions the rotary embeddings are applied to, using a parameter called rope_percentage.
(URL:
In what cases would you set rope_percentage to a value less than 1?
(I did confirm that experiment.py sets rope_percentage to 1.0.)
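
To make the question concrete, here is a minimal sketch of how I understand the parameter. This is not the repository's code: the helper names `rope_dims` and `apply_partial_rope`, the `base` default, the rounding down to an even number of features, and the pairing of feature i with feature i + d_rope/2 are all my own assumptions for illustration. Only the first d_rope features of a head are rotated, and the rest pass through unchanged.

```python
import math


def rope_dims(d_k: int, rope_percentage: float) -> int:
    """Number of leading features that get rotary embeddings (hypothetical helper)."""
    d_rope = int(d_k * rope_percentage)
    # Assumption: round down to an even count so features can be paired.
    return d_rope - d_rope % 2


def apply_partial_rope(x, position, rope_percentage, base=10000.0):
    """Rotate only the first d_rope features of one head vector x.

    x is a plain list of floats of length d_k; position is the token index.
    Features at index >= d_rope are returned unchanged.
    """
    d_rope = rope_dims(len(x), rope_percentage)
    half = d_rope // 2
    out = list(x)
    for i in range(half):
        # Rotation frequency for pair i, assuming the usual base ** (-2i / d) schedule.
        theta = position * base ** (-2.0 * i / d_rope)
        a, b = x[i], x[i + half]
        out[i] = a * math.cos(theta) - b * math.sin(theta)
        out[i + half] = b * math.cos(theta) + a * math.sin(theta)
    return out


if __name__ == "__main__":
    query = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
    rotated = apply_partial_rope(query, position=3, rope_percentage=0.5)
    print(rotated[4:])  # the last four features are passed through untouched
```

With rope_percentage = 0.5 and d_k = 8, only the first four features carry position information in this sketch. My question is when that trade-off is preferable to rotating all of them.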