lucidrains/local-attention
An implementation of local windowed attention for language modeling
Python · MIT
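For orientation, a minimal usage sketch of the library this repo provides. The constructor arguments shown (window_size, causal, look_backward, look_forward) are assumptions based on recent versions of the package and may differ across releases:

```python
import torch
from local_attention import LocalAttention

# queries / keys / values: (batch, heads, seq_len, head_dim)
q = torch.randn(2, 8, 2048, 64)
k = torch.randn(2, 8, 2048, 64)
v = torch.randn(2, 8, 2048, 64)

# each position attends within its local window, optionally also
# looking at neighboring windows (argument names are assumed)
attn = LocalAttention(
    dim = 64,            # head dimension
    window_size = 512,   # size of each local attention window
    causal = True,       # autoregressive masking
    look_backward = 1,   # also attend to one preceding window
    look_forward = 0,    # no future windows when causal
    dropout = 0.1
)

mask = torch.ones(2, 2048).bool()  # key padding mask, (batch, seq_len)
out = attn(q, k, v, mask = mask)   # (2, 8, 2048, 64)
```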
Issues
- May I be allowed to delete einops and replace it with the operations provided by torch? (#20, opened by wencan, 3 comments)
- Which is exactly the attention pattern? (#11, opened by beleen23, 0 comments)
- LocalTransformer Encoder Layer (#19, opened by AmitMY, 1 comment)
- The look_around function seems to be incorrect (#18, opened by datvuthanh, 2 comments; see the look_around sketch after this list)
- About the performance (#17, opened by ThyrixYang, 0 comments)
- Attention weight (#16, opened by emanuele-mincato, 1 comment)
- Wrong shape for attention bias vs sim tensor (#15, opened by inspirit, 5 comments)
- xPos Rotary Embeddings (#14, opened by ilya16, 1 comment)
- More control over attention masking (#9, opened by Mindful, 1 comment)
- question about the look around operation (#2, opened by benywon, 0 comments)
- question about the local attention (#1, opened by benywon)
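Two issues above (#18 and #2) ask about the look_around operation. As a rough sketch of the general idea, assuming the common pattern of padding along the window axis and concatenating each window with its neighbors (not necessarily line-for-line what this repo ships):

```python
import torch
import torch.nn.functional as F

def look_around(x, backward = 1, forward = 0, pad_value = -1, dim = 2):
    # x: (batch, num_windows, window_size, ...)
    # pad along the window axis, then concatenate each window with its
    # `backward` predecessors and `forward` successors along `dim`
    n = x.shape[1]
    pad_dims = (len(x.shape) - dim) * (0, 0)
    padded = F.pad(x, (*pad_dims, backward, forward), value = pad_value)
    shifted = [padded[:, i:(i + n), ...] for i in range(backward + forward + 1)]
    return torch.cat(shifted, dim = dim)

# example: 1 batch, 4 windows of size 2, scalar features
x = torch.arange(8).reshape(1, 4, 2, 1)
out = look_around(x, backward = 1, forward = 0)
print(out.shape)  # torch.Size([1, 4, 4, 1]): each window now holds the previous window plus itself
```

With this gather in place, local attention computes scores between each query window and its looked-around key/value windows, with pad positions masked out via pad_value.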