Issues
- 0
- 0
W_o matrix in multiheadattention
#23 opened - 1
- 0
- 0
Not an issue, but a re-implementation
#20 opened - 2
translate.py not consistent
#19 opened - 3
only spaces get predicted
#18 opened - 1
- 0
- 1
Quality after 20 epoch training
#15 opened - 2
- 0
Fourier Positional Embeddings
#13 opened - 1
Truncation for Tokenizer
#12 opened - 1
forward method in transformer class
#10 opened - 3
dataset
#9 opened - 3
- 4
Question: train.py
#5 opened - 1
multihead attention biases
#4 opened - 2
Position Encoding
#2 opened