lucidrains/linear-attention-transformer
Transformer based on a variant of attention that is linear in complexity with respect to sequence length
Python · MIT License
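To make the headline claim concrete, here is a minimal PyTorch sketch of the non-causal linear attention trick (softmax over the query feature dimension and over the key sequence dimension, then associating the k^T v product first). This illustrates the general technique, not the library's actual code:

```python
import torch

def linear_attention(q, k, v):
    # q, k, v: (batch, heads, seq_len, dim_head)
    # Softmax over the feature dimension for queries and over the
    # sequence dimension for keys, then compute (k^T v) first:
    # cost is O(n * d^2) instead of the O(n^2 * d) of vanilla attention.
    q = q.softmax(dim=-1)
    k = k.softmax(dim=-2)
    context = torch.einsum('bhnd,bhne->bhde', k, v)   # (d, d) summary, independent of n
    out = torch.einsum('bhnd,bhde->bhne', q, context) # back to (n, d)
    return out

q = k = v = torch.randn(1, 8, 1024, 64)
print(linear_attention(q, k, v).shape)  # torch.Size([1, 8, 1024, 64])
```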
Issues
Image linear attention reference
#19 opened by pravn · 0 comments
Why dim != dim_head * heads?
#18 opened by zzczzc20 · 0 comments
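Regarding #18: in most multi-head attention implementations, `dim` (the model width) and `dim_head * heads` (the total attention width) are decoupled by input and output projections, so they need not be equal. A hypothetical illustration of that pattern, not the repo's code:

```python
import torch
from torch import nn

class MultiHeadProjection(nn.Module):
    # Illustrative only: the model width `dim` and the total attention
    # width `heads * dim_head` are bridged by two linear maps.
    def __init__(self, dim=512, heads=8, dim_head=128):
        super().__init__()
        inner = heads * dim_head           # 1024, deliberately != dim
        self.to_qkv = nn.Linear(dim, inner * 3, bias=False)
        self.to_out = nn.Linear(inner, dim)

    def forward(self, x):
        q, k, v = self.to_qkv(x).chunk(3, dim=-1)
        # ... attention happens at width `inner` ...
        return self.to_out(v)              # projected back to `dim`

x = torch.randn(2, 16, 512)
print(MultiHeadProjection()(x).shape)  # torch.Size([2, 16, 512])
```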
How to perform training?
#17 opened by pangshengwei · 2 comments
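Regarding #17: the model is a standard autoregressive LM, so training is ordinary next-token cross-entropy. A sketch assuming the `LinearAttentionTransformerLM` constructor arguments shown in the repo's README (verify the names against the current source):

```python
import torch
import torch.nn.functional as F
# Assumes the package's documented LM class and constructor arguments.
from linear_attention_transformer import LinearAttentionTransformerLM

model = LinearAttentionTransformerLM(
    num_tokens = 256, dim = 512, heads = 8,
    depth = 4, max_seq_len = 1024, causal = True)
opt = torch.optim.Adam(model.parameters(), lr = 1e-4)

for step in range(100):
    tokens = torch.randint(0, 256, (4, 1025))    # stand-in for a real batch
    inp, target = tokens[:, :-1], tokens[:, 1:]  # next-token prediction
    logits = model(inp)                          # (batch, seq, num_tokens)
    loss = F.cross_entropy(logits.transpose(1, 2), target)
    loss.backward()
    opt.step()
    opt.zero_grad()
```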
Does the causal attention really work here?
#16 opened by charlesxu90 · 1 comment
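Regarding #16: causality in linear attention is usually enforced with prefix sums rather than an attention mask, so position i only ever sees keys j <= i. A naive sketch of that trick; it materializes a d x d state per position (real implementations chunk this for memory), and the `elu + 1` feature map from Katharopoulos et al. is an assumption that may differ from this repo's choice:

```python
import torch
import torch.nn.functional as F

def causal_linear_attention(q, k, v, eps=1e-6):
    # q, k, v: (batch, heads, seq, dim)
    q = F.elu(q) + 1   # positive feature map
    k = F.elu(k) + 1
    # Prefix sums over the sequence: position i accumulates only j <= i.
    kv = torch.einsum('bhnd,bhne->bhnde', k, v).cumsum(dim=2)
    z = k.cumsum(dim=2)
    num = torch.einsum('bhnd,bhnde->bhne', q, kv)
    den = torch.einsum('bhnd,bhnd->bhn', q, z).clamp(min=eps)
    return num / den.unsqueeze(-1)

q = k = v = torch.randn(1, 4, 256, 32)
print(causal_linear_attention(q, k, v).shape)  # torch.Size([1, 4, 256, 32])
```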
Scaling factors
#14 opened by radandreicristian · 0 comments
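Regarding #14: the usual `1/sqrt(d)` factor exists because the dot product of two d-dimensional unit-variance vectors has variance d, which would saturate a softmax. A quick numerical check of that claim:

```python
import torch

# Dot products of random unit-variance vectors have variance ~d,
# so logits are rescaled by d**-0.5 before the softmax.
d = 64
q, k = torch.randn(1000, d), torch.randn(1000, d)
logits = (q * k).sum(-1)
print(logits.var())                # ~64
print((logits * d ** -0.5).var())  # ~1
```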
ImageLinearAttention showcase
#11 opened by monajalal · 0 comments
Challenge in replacing SelfAttention with ImageLinearAttention in Vision Transformer
#13 opened by monajalal · 0 comments
Loss returns NaN
#6 opened by terencenwz · 1 comment
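Regarding #6: a frequent source of NaNs in linear attention is the normalizing denominator (q · sum(k)) underflowing to zero, especially in fp16; whether that matches the reporter's case is an assumption. A toy demonstration of the failure and the common clamp-based fix:

```python
import torch

num = torch.randn(4)
den = torch.tensor([1.0, 0.5, 0.0, 2.0])  # one zero denominator...
print(num / den)                 # ...yields inf/nan, which poisons the loss
print(num / den.clamp(min=1e-8)) # clamped: finite everywhere
```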
Where does this constant come from?
#7 opened by aluo-x · 2 comments
causal = True
#5 opened by wajihullahbaig · 1 comment
Positional encoding?
#3 opened by matthew-jurewicz · 40 comments
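Regarding #3: attention over content alone is largely insensitive to token order, so positional information has to be injected somewhere. Below is a sketch of the classic fixed sinusoidal encoding from "Attention Is All You Need"; the repo may use a different scheme (e.g. learned or axial embeddings), so treat this as generic:

```python
import math
import torch

def sinusoidal_positions(seq_len, dim):
    # Standard fixed encoding: each dimension pair oscillates at a
    # different geometric frequency, giving every position a unique code.
    pos = torch.arange(seq_len).unsqueeze(1)
    freq = torch.exp(torch.arange(0, dim, 2) * (-math.log(10000.0) / dim))
    pe = torch.zeros(seq_len, dim)
    pe[:, 0::2] = torch.sin(pos * freq)
    pe[:, 1::2] = torch.cos(pos * freq)
    return pe

tokens = torch.randn(2, 128, 512)
tokens = tokens + sinusoidal_positions(128, 512)  # added before the first layer
```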
[Question] Merging with Trans-XL?
#2 opened by gaceladri · 4 comments
seq2seq decoder ids
#1 opened