meetps/pytorch-semseg

RuntimeError: The size of tensor a (33) must match the size of tensor b (34) at non-singleton dimension 2

FamiliennameistChow opened this issue · 3 comments

I tried to run ICNet on Pascal VOC 2012, but I got this error:

train.py:230: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
  cfg = yaml.load(fp)
RUNDIR: runs/icnet_pascal/84704
Exception ignored in: <bound method _DataLoaderIter.__del__ of <torch.utils.data.dataloader._DataLoaderIter object at 0x7fdc35894278>>
Traceback (most recent call last):
  File "/home/xy/anaconda3/envs/pytorch_semseg/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 399, in __del__
    self._shutdown_workers()
  File "/home/xy/anaconda3/envs/pytorch_semseg/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 378, in _shutdown_workers
    self.worker_result_queue.get()
  File "/home/xy/anaconda3/envs/pytorch_semseg/lib/python3.6/multiprocessing/queues.py", line 337, in get
    return _ForkingPickler.loads(res)
  File "/home/xy/anaconda3/envs/pytorch_semseg/lib/python3.6/site-packages/torch/multiprocessing/reductions.py", line 151, in rebuild_storage_fd
    fd = df.detach()
  File "/home/xy/anaconda3/envs/pytorch_semseg/lib/python3.6/multiprocessing/resource_sharer.py", line 57, in detach
    with _resource_sharer.get_connection(self._id) as conn:
  File "/home/xy/anaconda3/envs/pytorch_semseg/lib/python3.6/multiprocessing/resource_sharer.py", line 87, in get_connection
    c = Client(address, authkey=process.current_process().authkey)
  File "/home/xy/anaconda3/envs/pytorch_semseg/lib/python3.6/multiprocessing/connection.py", line 493, in Client
    answer_challenge(c, authkey)
  File "/home/xy/anaconda3/envs/pytorch_semseg/lib/python3.6/multiprocessing/connection.py", line 732, in answer_challenge
    message = connection.recv_bytes(256)         # reject large message
  File "/home/xy/anaconda3/envs/pytorch_semseg/lib/python3.6/multiprocessing/connection.py", line 216, in recv_bytes
    buf = self._recv_bytes(maxlength)
  File "/home/xy/anaconda3/envs/pytorch_semseg/lib/python3.6/multiprocessing/connection.py", line 407, in _recv_bytes
    buf = self._recv(4)
  File "/home/xy/anaconda3/envs/pytorch_semseg/lib/python3.6/multiprocessing/connection.py", line 383, in _recv
    raise EOFError
EOFError:
Traceback (most recent call last):
  File "train.py", line 242, in <module>
    train(cfg, writer, logger)
  File "train.py", line 140, in train
    outputs = model(images)
  File "/home/xy/anaconda3/envs/pytorch_semseg/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/xy/anaconda3/envs/pytorch_semseg/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 121, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/home/xy/anaconda3/envs/pytorch_semseg/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/xy/disk2/zhoubo/pytorch_semsge/ptsemseg/models/icnet.py", line 210, in forward
    x_sub12, sub24_cls = self.cff_sub12(x_sub24, x_sub1)
  File "/home/xy/anaconda3/envs/pytorch_semseg/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/xy/disk2/zhoubo/pytorch_semsge/ptsemseg/models/utils.py", line 764, in forward
    high_fused_fm = F.relu(low_fm + high_fm, inplace=True)
RuntimeError: The size of tensor a (33) must match the size of tensor b (34) at non-singleton dimension 2
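For context on where the 33-vs-34 mismatch can come from: ICNet fuses feature maps from branches that downsample the input by different factors and then upsample one of them back. When the input height/width is not divisible by the total stride, the repeated stride-2 stages round up (a 3x3 conv with stride 2 and padding 1 maps H to ceil(H/2)), so after upsampling the two branches can differ by one pixel. The sketch below is a minimal illustration of that mechanism, not the exact ICNet code; the `downsample` helper and the sample height 260 are assumptions chosen to reproduce a 33-vs-34 disagreement:

```python
import math

def downsample(h, times):
    # illustrative assumption: each stride-2 conv (kernel 3, padding 1)
    # maps a spatial size H to ceil(H / 2)
    for _ in range(times):
        h = math.ceil(h / 2)
    return h

H = 260                        # hypothetical input height ('same' keeps the raw image size)
sub1 = downsample(H, 3)        # 1/8-resolution branch -> 33
sub2 = downsample(H, 4) * 2    # 1/16-resolution branch upsampled x2 -> 34
print(sub1, sub2)              # 33 34: the two fused tensors disagree by one
```

With a height divisible by 16 (e.g. 256) both branches would come out equal, which is why fixed, divisible crop sizes avoid the error.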

I tried it with PyTorch 0.4.1 and 1.0. My config is:

model:  
    arch: icnet  
data:
    dataset: pascal  
    train_split: train_aug  
    val_split: val  
    img_rows: 'same'  
    img_cols: 'same'  
    path: /home/xy/disk1/pascalvoc/VOCdevkit/VOC2012/  
    sbd_path: /home/xy/disk1/pascalvoc/benchmark_RELEASE/  
training:  
    train_iters: 300000  
    batch_size: 1  
    val_interval: 1000  
    n_workers: 16  
    print_interval: 50  
    optimizer:  
        name: 'sgd'  
        lr: 1.0e-10  
        weight_decay: 0.0005  
        momentum: 0.99  
    loss:  
        name: 'cross_entropy'  
        size_average: False  
    lr_schedule:  
    resume: icnet_pascal_best_model.pkl  
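Note that `img_rows: 'same'` / `img_cols: 'same'` keeps each VOC image at its native size, and those sizes are rarely divisible by the network's total stride. One common workaround is to pad (or resize/crop) the input up to the next multiple of the deepest downsampling factor before feeding the model. A minimal sketch, assuming a deepest stride of 32 (the helper `pad_to_multiple` is hypothetical, not part of this repo):

```python
def pad_to_multiple(h, w, m=32):
    # round spatial dims up to the next multiple of m so that every
    # stride-2 stage halves the size exactly and the fused branches
    # line up after upsampling
    pad_h = (m - h % m) % m
    pad_w = (m - w % m) % m
    return h + pad_h, w + pad_w

print(pad_to_multiple(260, 347))  # -> (288, 352)
```

Alternatively, setting `img_rows` and `img_cols` to fixed values divisible by 32 in the config should sidestep the mismatch entirely.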

How can I solve this problem? Thank you.

Hi, have you solved the problem? I had the same problem. Can you help me?

I have the same problem.
