Why do the patch_src and patch_trg functions in train.py drop the last example of each batch?
pluspluswu opened this issue · 1 comments
pluspluswu commented
hi,
I'm trying to train this model with my own dataloader. My data iterator yields batch-first tensors of size [batch_size, seq_len], and with that input the function patch_trg in train.py returns trg_seq with size [seq_len, batch_size - 1] and gold with size seq_len * (batch_size - 1). I can take the transpose myself to get the layout the matrix calculations need, but why does this function drop the last example of each batch? Or am I using these patch functions incorrectly?
def patch_src(src, pad_idx):
    src = src.transpose(0, 1)
    return src

def patch_trg(trg, pad_idx):
    trg = trg.transpose(0, 1)
    trg, gold = trg[:, :-1], trg[:, 1:].contiguous().view(-1)
    return trg, gold
Or should I change the function like this to better fit my own data, which is batch first?
def new_patch_trg(trg, pad_idx):
    trg, gold = trg[:, :-1], trg[:, 1:].contiguous().view(-1)
    trg = trg.transpose(0, 1)
    return trg, gold
Thanks
AlessandroMondin commented
Explanation is here
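For anyone landing here without following the link, a minimal sketch of what seems to be going on. It assumes that patch_trg expects sequence-first input of size [seq_len, batch_size] and that the model itself consumes batch-first tensors, which is what the transpose in the original function suggests. Under that assumption the slice trg[:, :-1] removes the last token of every sentence (the usual one-step shift between decoder input and gold labels), not the last example. The name patch_trg_batch_first and the toy sizes are made up for illustration; they are not part of the repository.

```python
import torch


def patch_trg(trg, pad_idx):
    # Same logic as the function quoted in the issue.
    # Assumed input layout: [seq_len, batch_size].
    trg = trg.transpose(0, 1)  # -> [batch_size, seq_len]
    # Decoder input drops the last token; gold drops the first token.
    trg, gold = trg[:, :-1], trg[:, 1:].contiguous().view(-1)
    return trg, gold


def patch_trg_batch_first(trg, pad_idx):
    # Hypothetical variant for input that is already [batch_size, seq_len]:
    # the same shift, with no transpose before or after.
    trg, gold = trg[:, :-1], trg[:, 1:].contiguous().view(-1)
    return trg, gold


batch_size, seq_len, pad_idx = 4, 6, 0  # toy values for illustration

# The same toy batch in both layouts.
seq_first = torch.arange(1, seq_len * batch_size + 1).view(seq_len, batch_size)
batch_first = seq_first.transpose(0, 1).contiguous()

# Sequence-first input: the slice acts on the sequence dimension.
trg_a, gold_a = patch_trg(seq_first, pad_idx)
print(tuple(trg_a.shape), tuple(gold_a.shape))  # (4, 5) (20,)

# Batch-first input fed to the original function: the transpose moves the
# batch onto dim 1, so the slice removes an example instead of a token.
wrong_trg, wrong_gold = patch_trg(batch_first, pad_idx)
print(tuple(wrong_trg.shape), tuple(wrong_gold.shape))  # (6, 3) (18,)

# Batch-first input with the variant: identical to the sequence-first result.
trg_b, gold_b = patch_trg_batch_first(batch_first, pad_idx)
print(torch.equal(trg_a, trg_b), torch.equal(gold_a, gold_b))  # True True
```

The second case reproduces the shapes reported above: [seq_len, batch_size - 1] and seq_len * (batch_size - 1). Note that the new_patch_trg proposed in the question still transposes trg after slicing, which gives [seq_len - 1, batch_size]; if the model does expect batch-first input, that final transpose would need to go as well.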