JialianW/TraDeS

RuntimeError: The size of tensor a (113) must match the size of tensor b (112) at non-singleton dimension 2

anthonyweidai opened this issue · 5 comments

I get this error when training on my own custom dataset and I don't know why. Could you give me some advice?

loading annotations into memory...
Done (t=0.05s)
creating index...
index created!
Creating video index!
Loaded Custom dataset 283 samples
Starting training...
/home/mars/anaconda3/envs/trades/lib/python3.6/site-packages/torch/nn/functional.py:718: UserWarning: Named tensors and all their associated APIs are an experimental feature and subject to change. Please do not use them for anything important until they are released as stable. (Triggered internally at /opt/conda/conda-bld/pytorch_1623448272031/work/c10/core/TensorImpl.h:1156.)
return torch.max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode)
Traceback (most recent call last):
File "main.py", line 101, in <module>
main(opt)
File "main.py", line 70, in main
log_dict_train, _ = trainer.train(epoch, train_loader)
File "/home/mars/TraDes/src/lib/trainer.py", line 364, in train
return self.run_epoch('train', epoch, data_loader)
File "/home/mars/TraDes/src/lib/trainer.py", line 194, in run_epoch
output, loss, loss_stats = model_with_loss(batch)
File "/home/mars/anaconda3/envs/trades/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/home/mars/anaconda3/envs/trades/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 168, in forward
outputs = self.parallel_apply(replicas, inputs, kwargs)
File "/home/mars/anaconda3/envs/trades/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 178, in parallel_apply
return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
File "/home/mars/anaconda3/envs/trades/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 86, in parallel_apply
output.reraise()
File "/home/mars/anaconda3/envs/trades/lib/python3.6/site-packages/torch/_utils.py", line 425, in reraise
raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in replica 0 on device 0.
Original Traceback (most recent call last):
File "/home/mars/anaconda3/envs/trades/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 61, in _worker
output = module(*input, **kwargs)
File "/home/mars/anaconda3/envs/trades/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/home/mars/TraDes/src/lib/trainer.py", line 143, in forward
outputs = self.model(batch['image'], pre_img, pre_hm, addtional_pre_imgs, addtional_pre_hms)
File "/home/mars/anaconda3/envs/trades/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/home/mars/TraDes/src/lib/model/networks/base_model.py", line 164, in forward
cur_feat = self.img2feats(x)
File "/home/mars/TraDes/src/lib/model/networks/dla.py", line 602, in img2feats
x = self.base(x)
File "/home/mars/anaconda3/envs/trades/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/home/mars/TraDes/src/lib/model/networks/dla.py", line 294, in forward
x = getattr(self, 'level{}'.format(i))(x)
File "/home/mars/anaconda3/envs/trades/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/home/mars/TraDes/src/lib/model/networks/dla.py", line 221, in forward
x1 = self.tree1(x, residual)
File "/home/mars/anaconda3/envs/trades/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/home/mars/TraDes/src/lib/model/networks/dla.py", line 221, in forward
x1 = self.tree1(x, residual)
File "/home/mars/anaconda3/envs/trades/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/home/mars/TraDes/src/lib/model/networks/dla.py", line 63, in forward
out += residual
RuntimeError: The size of tensor a (113) must match the size of tensor b (112) at non-singleton dimension 2

The error is raised in dla.py, at the `out += residual` line.

Try making the image height and width evenly divisible by 32.
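
To illustrate why divisibility by 32 matters, here is a minimal sketch of the size arithmetic. It assumes the main branch downsamples with a 3x3 convolution (stride 2, padding 1) and the residual branch with a 2x2 max pool (stride 2); these layer settings are an assumption about dla.py, not something confirmed in this thread, and the function names are made up for the example. With those settings an odd input size gives two different outputs, which is the kind of 113 vs 112 mismatch in the traceback.

```python
def conv_out(size, kernel=3, stride=2, padding=1):
    """Output length of a strided convolution (assumed main branch)."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel=2, stride=2):
    """Output length of a max pool without padding (assumed residual branch)."""
    return (size - kernel) // stride + 1


def first_mismatch(size, levels=5):
    """Halve `size` up to `levels` times; return (level, conv, pool) where
    the two branches first disagree, or None if they always agree."""
    for level in range(levels):
        c, p = conv_out(size), pool_out(size)
        if c != p:
            return level, c, p
        size = c
    return None


# An odd intermediate size splits into 113 (conv) and 112 (pool).
print(conv_out(225), pool_out(225))
# A size divisible by 32 survives five halvings without a mismatch.
print(first_mismatch(512))
# A size not divisible by 32 eventually hits an odd value.
print(first_mismatch(900))
```

Five halvings correspond to a total stride of 32, so any size that is a multiple of 32 stays even at every step in this sketch.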

What kind of function should I use? A resize/rescale transform from the PyTorch transforms? Should it go in the dataset that is passed to torch.utils.data.DataLoader? I'm new to deep learning and don't know much about image pre-processing. Can you give me more details on how to do this?
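
One possible approach, sketched below, is to pad each image on the right and bottom up to the next multiple of 32 instead of resizing it, so the bounding-box coordinates in the annotations stay valid. The helper names `round_up` and `pad_amounts` are hypothetical and not part of TraDeS. The returned tuple follows the `(left, right, top, bottom)` order that `torch.nn.functional.pad` expects for the last two dimensions of an image tensor.

```python
def round_up(size, divisor=32):
    """Smallest multiple of `divisor` that is >= size."""
    return ((size + divisor - 1) // divisor) * divisor


def pad_amounts(height, width, divisor=32):
    """Padding (left, right, top, bottom) that makes both sides divisible.

    Only the right and bottom edges are padded, so pixel coordinates of
    existing annotations do not shift.
    """
    pad_h = round_up(height, divisor) - height
    pad_w = round_up(width, divisor) - width
    return (0, pad_w, 0, pad_h)


# Example: a 900 x 1600 (height x width) frame needs 28 extra rows.
print(round_up(900))            # target height
print(pad_amounts(900, 1600))   # padding tuple

# With PyTorch installed, the tuple could be applied to a (C, H, W) tensor as:
#   padded = torch.nn.functional.pad(image, pad_amounts(H, W))
```

If you resize instead of padding, the annotations have to be scaled by the same factors. Where to apply either step depends on how the custom dataset class prepares its images, which this thread does not show.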