lucidrains/FLASH-pytorch
Implementation of the Transformer variant proposed in "Transformer Quality in Linear Time"
Python · MIT license
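As a quick illustration of the API, below is a minimal usage sketch of the gated attention unit (GAU) building block. The `flash_pytorch` module path, the `GAU` class, and argument names such as `query_key_dim` and `expansion_factor` follow the repo's published README, but treat them as assumptions to verify against the installed version:

```python
import torch
from flash_pytorch import GAU  # Gated Attention Unit from the paper

# Argument names below are assumed from the repo's README; verify locally.
gau = GAU(
    dim = 512,              # model / feature dimension
    query_key_dim = 128,    # small shared query/key dimension used by GAU
    causal = True,          # apply autoregressive (causal) masking
    expansion_factor = 2,   # expansion of the gating branch's hidden size
)

x = torch.randn(1, 1024, 512)  # (batch, sequence length, dim)
out = gau(x)                   # same shape as the input: (1, 1024, 512)
```

The paper replaces the usual multi-head attention plus feedforward pair with this single gated unit, which is why one small shared query/key dimension suffices.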
Issues
- I would like to ask if your model can be applied to other text classification tasks? (#14, opened by ZoeLct, 0 comments)
- About the "/n" (#13, opened by kj01239876, 2 comments)
- The speed. (#11, opened by wangyuxin87, 1 comment)
- Speed on TPU (#6, opened by magicknight, 6 comments)
- mask error (#1, opened by keyunluo, 1 comment)
- Is it a typo in FLASH module? (#10, opened by marsggbo, 2 comments)
- About the "shift_tokens" (#5, opened by kangzhao2, 1 comment)
- rel_pos_bias in GAU (#9, opened by SunderlandAJ-1130, 1 comment)
- Cross-Attention? (#4, opened by amorehead, 5 comments)
- einsum operation in Linear Attention Part (#2, opened by ShomyLiu)