graykode/nlp-tutorial

If create the Linear() casually, it won't be trained during training.

yangdechuan opened this issue · 1 comments

output = nn.Linear(n_heads * d_v, d_model)(context)

Thanks for sharp caught Could you contribute for that? Also it contains jupyter file corresponding to code