hzxie/Pix2Vox

Training Model on High Resolution Images

Closed · 1 comment

Hi, I am trying to train your model on a high-resolution image dataset. My dataset has images of dimension 512x512, and the binvox files are of dimension 512x512x512. I implemented the dataloader for my dataset and updated the config.py file, but at training time I get the following error:

Traceback (most recent call last):
  File "runner.py", line 93, in <module>
    main()
  File "runner.py", line 74, in main
    train_net(cfg)
  File "/home/imperial-dragon/Workspace/Python/Pix2Vox/core/train.py", line 229, in train_net
    raw_features, generated_volumes = decoder(image_features)
  File "/home/imperial-dragon/Workspace/Python/Pix2Vox/env/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/imperial-dragon/Workspace/Python/Pix2Vox/env/lib/python3.8/site-packages/torch/nn/parallel/data_parallel.py", line 159, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/home/imperial-dragon/Workspace/Python/Pix2Vox/env/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/imperial-dragon/Workspace/Python/Pix2Vox/models/decoder.py", line 46, in forward
    gen_volume = features.view(-1, 2048, 2, 2, 2)
RuntimeError: shape '[-1, 2048, 2, 2, 2]' is invalid for input of size 102400
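If I read the numbers correctly, the view() in decoder.py expects a multiple of 2048 x 2 x 2 x 2 = 16384 elements (what I believe is the 256 x 8 x 8 feature map produced for 224 x 224 inputs), while my 512 x 512 inputs seem to produce 102400 elements per view (possibly 256 x 20 x 20), which is not divisible by 16384. A rough check with my own numbers, not code from the repository:

```python
import torch

# The decoder reshapes each view's features into a 2048 x 2 x 2 x 2 block,
# i.e. 16384 elements, but my larger inputs yield 102400 elements per view.
print(102400 % (2048 * 2 * 2 * 2))   # 4096 -> not a multiple, hence the error

features = torch.zeros(102400)
features.view(-1, 2048, 2, 2, 2)     # raises the same RuntimeError as above
```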

After going through the code of decoder.py, I see that the ConvTranspose3d layers are created with some hardcoded parameters that I would have to change according to my image dimensions to fix this error. Similar hardcoded parameters are present in the encoder, merger, and refiner.
How do these parameters vary with the dimensions of the images or binvox files?
Can you guide me on how to set these parameters, or how to calculate them for my dataset?
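For context, here is my rough understanding of where those hardcoded numbers come from (a minimal sketch using the generic PyTorch output-size formula, assuming the kernel_size=4, stride=2, padding=1 upsampling blocks I see in decoder.py; this is not code from the repository):

```python
def tconv3d_out(size, kernel_size=4, stride=2, padding=1, output_padding=0):
    """Output size along one dimension of a single nn.ConvTranspose3d."""
    return (size - 1) * stride - 2 * padding + kernel_size + output_padding

size = 2                    # the 2 x 2 x 2 seed from features.view(-1, 2048, 2, 2, 2)
for block in range(4):      # the four doubling blocks in decoder.py, as I read them
    size = tconv3d_out(size)
    print(f"after block {block + 1}: {size}^3")
# -> 4^3, 8^3, 16^3, 32^3, i.e. the 32^3 voxel grid the network was designed for.
# Reaching 512^3 this way would need eight doubling blocks (2 * 2**8 = 512),
# plus matching changes in the encoder, merger, and refiner.
```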

hzxie commented

Please make sure that the input images are of size 224 x 224.
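For anyone hitting the same issue, one simple way to satisfy this (a sketch of my own using OpenCV, not code from the repo; the input array stands in for a loaded rendering) is to resize the images before they reach the data transforms:

```python
import cv2
import numpy as np

# Hypothetical preprocessing step: downscale each 512 x 512 rendering to the
# 224 x 224 resolution the pretrained encoder expects.
rendering_image = np.zeros((512, 512, 3), dtype=np.uint8)   # stand-in for cv2.imread(...)
rendering_image = cv2.resize(rendering_image, (224, 224), interpolation=cv2.INTER_AREA)
print(rendering_image.shape)   # (224, 224, 3)
```

Note that this only addresses the image side; if I understand the decoder correctly, it still outputs a 32^3 grid, so 512^3 binvox targets would also need to be downsampled (or the network reconfigured) to match.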