batch size problem
QiRenn opened this issue · 1 comment
I am training on an A800, and initially training with a batch size of 1 was successful. However, when I changed the batch size to 2, an issue arose.
```
cost_cls = compute_label_cost(outputs, targets, self.num_classes)
  File "SOC-main/models/matcher.py", line 175, in compute_label_cost
    cost_class_splits = torch.stack(cost_class_splits, dim=-1) #stack in the instances size
RuntimeError: stack expects each tensor to be equal size, but got [8, 40] at entry 0 and [7, 40] at entry 1
```
May I ask what this is about?
Thanks for your attention to our work!
The currently uploaded version of the code only supports batch size 1 when training on Ref-Youtube-VOS. The error occurs when any sample in a batch contains a frame where the object is None, so the per-sample cost tensors end up with different sizes and cannot be stacked. If you want to train with larger batches, you can either create a zero tensor of shape [t, bnq], compute the costs per sample, and aggregate them into it, or take the mean over t first and then stack. However, we found that both approaches cause a performance drop, so we decided to train with batch size 1 only.
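For reference, here is a minimal sketch of the padding workaround. The function and argument names are illustrative, not the actual matcher code, and it assumes each per-sample cost tensor has shape [t_i, n_q] with t_i ≤ t:

```python
import torch

def pad_and_stack(cost_class_splits, t, n_q):
    """Pad per-sample cost tensors to a common [t, n_q] shape, then stack.

    cost_class_splits: list of tensors, each [t_i, n_q], where t_i may be
    smaller than t for a sample whose frames contain no annotated object.
    """
    padded = []
    for cost in cost_class_splits:
        full = cost.new_zeros(t, n_q)   # zero tensor of the full size
        full[: cost.shape[0]] = cost    # copy the valid rows into it
        padded.append(full)
    # All entries now share the same shape, so stacking succeeds.
    return torch.stack(padded, dim=-1)

# Alternative: reduce the temporal dimension first, then stack, e.g.
# torch.stack([c.mean(dim=0) for c in cost_class_splits], dim=-1)
```

Note that zero-padding changes the cost values the matcher sees for the missing frames, which is consistent with the performance drop mentioned above.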