lucidrains/x-transformers
A simple but complete full-attention transformer with a set of promising experimental features from various papers
Python · MIT License
Issues
- Problem with cache and memory (#255)
- How to use "src_key_padding_mask" (#253)
- RoPE inconsistency (2-dim subspaces choice) (#250)