lucidrains/h-transformer-1d
Implementation of H-Transformer-1D, Hierarchical Attention for Sequence Learning
Python · MIT license
Issues
- error in test (#25, opened by jizhang02, 0 comments)
- Billion Word Benchmark - Reproducibility (#24, opened by DavidHerel, 0 comments)
- Approximated values are off (#20, opened by jglaser, 6 comments)
- Masking not working in training, thanks (#18, opened by junyongyou, 0 comments)
- Sequence classfication, thanks a lot (#17, opened by junyongyou, 4 comments)
- Application to sequence classification? (#12, opened by trpstra, 1 comment)
- Add Norm Missing (#16, opened by wwx13, 2 comments)
- Mask not working (#15, opened by wwx13, 3 comments)
- Algorithm Mismatch (#13, opened by jinmang2, 1 comment)
- One simple question (#10, opened by CiaoHe, 1 comment)
- RuntimeError: Tensor type unknown to einops `<class 'torch.return_types.max'>` (#6, opened by wajihullahbaig, 1 comment)
- Sequence length issue when `causal = True` (#8, opened by jaak-s, 3 comments)
- Example in README does not work (#3, opened by jaak-s, 4 comments)
- H-Transformer for Cross-Attention? (#2, opened by Vbansal21)