moein-shariatnia/Pix2Seq

Semantic conflict about the variable 'max_len'

JJJYmmm opened this issue · 0 comments

Hi Shariatnia, thanks for your tutorial!
I have a question about the variable max_len.
I first see max_len in the Tokenizer class, where its role seems to be limiting the maximum number of objects:

    labels = labels.astype('int')[:self.max_len]
    bboxes = self.quantize(bboxes)[:self.max_len]
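
For illustration, here is a minimal sketch of that truncation step in isolation (the array shapes and the max_len value are made up; only the slicing mirrors the snippet above):

    import numpy as np

    max_len = 3                               # hypothetical cap on the number of objects

    labels = np.array([7, 2, 5, 9, 1])        # 5 object class labels
    bboxes = np.random.rand(5, 4)             # one box (4 coordinates) per object

    labels = labels.astype('int')[:max_len]   # keeps only the first 3 labels
    bboxes = bboxes[:max_len]                 # keeps only the first 3 boxes (quantization omitted)

    print(labels.shape, bboxes.shape)         # (3,) (3, 4)

So here max_len clearly counts objects, not tokens.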

But in the collate_fn function used for the dataloader, max_len seems instead to limit the maximum length of the input token sequence:

    if max_len:  # pad each sequence: [B, seq_len] -> [B, max_len]
        pad = torch.ones(seq_batch.size(0),
                         max_len - seq_batch.size(1)).fill_(pad_idx).long()
        seq_batch = torch.cat([seq_batch, pad], dim=1)
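
Again for illustration, a self-contained sketch of the padding step (seq_batch, pad_idx, and max_len here are dummy values; the logic is the same as above):

    import torch

    pad_idx = 0
    max_len = 10
    seq_batch = torch.randint(1, 100, (2, 7))  # [B=2, seq_len=7] batch of token ids

    if max_len:  # pad each sequence: [B, seq_len] -> [B, max_len]
        pad = torch.ones(seq_batch.size(0),
                         max_len - seq_batch.size(1)).fill_(pad_idx).long()
        seq_batch = torch.cat([seq_batch, pad], dim=1)

    print(seq_batch.shape)  # torch.Size([2, 10])

Here max_len is clearly a token count: every sequence in the batch is padded out to max_len positions.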

I have checked where the two variables come from, and both are set from CFG.max_len, so it's not a coincidence.

I think the second max_len (the sequence length) should be 5 times the first one (the object count), excluding eos and bos, because each object corresponds to 5 tokens (4 bbox coordinates plus 1 class label). I don't know if I'm right; looking forward to your reply.
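
To make the mismatch concrete, a back-of-the-envelope calculation (assuming the usual Pix2Seq encoding of 4 quantized coordinate tokens plus 1 class token per object; the CFG.max_len value here is hypothetical):

    tokens_per_object = 5                 # 4 bbox coordinates + 1 class label
    max_objects = 25                      # hypothetical CFG.max_len used in the Tokenizer

    seq_len_needed = tokens_per_object * max_objects + 2  # plus bos and eos
    print(seq_len_needed)                 # 127, far more than 25

If both places read the same CFG.max_len, one of the two uses seems to get the wrong unit.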