bolianchen/pytorch_depth_from_videos_in_the_wild

AssertionError: must be downscaling

Closed this issue · 2 comments

I generated a dataset from videos using the command below:

python gen_data.py \
    --dataset_name video \
    --dataset_dir ./dataset/ \
    --save_dir ./processed \
    --img_height 300 \
    --img_width 300 \
    --mask color

And then started model training:

python train.py \
    --data_path ./processed \
    --png \
    --learn_intrinsics \
    --boxify \
    --model_name ./weights/trained

Error:

Training model named:
   ./weights/trained
Models and tensorboard events files are saved to:
   /home/nitin/Desktop/pytorch_depth_from_videos_in_the_wild/models
Training is using:
   cuda:0
/anaconda3/lib/python3.8/site-packages/torchvision/transforms/transforms.py:280: UserWarning: Argument interpolation should be of type InterpolationMode instead of int. Please, use InterpolationMode enum.
  warnings.warn(
Traceback (most recent call last):
  File "train.py", line 61, in <module>
    main_worker(0, world_size, method, gpus, dist_backend, unknown_args1)
  File "train.py", line 25, in main_worker
    trainer = method_zoo[method][1](opts)
  File "/home/nitin/Desktop/pytorch_depth_from_videos_in_the_wild/trainers/wild_trainer.py", line 29, in __init__
    super(WildTrainer, self).__init__(options)
  File "/home/nitin/Desktop/pytorch_depth_from_videos_in_the_wild/trainers/base_trainer.py", line 57, in __init__
    self._init_dataloaders()
  File "/home/nitin/Desktop/pytorch_depth_from_videos_in_the_wild/trainers/base_trainer.py", line 345, in _init_dataloaders
    self.repr_intrinsics = val_dataset.get_repr_intrinsics()
  File "/home/nitin/Desktop/pytorch_depth_from_videos_in_the_wild/datasets/custom_mono_dataset.py", line 134, in get_repr_intrinsics
    _, ratio, delta_u, delta_v, _ = self.get_color(
  File "/home/nitin/Desktop/pytorch_depth_from_videos_in_the_wild/datasets/custom_mono_dataset.py", line 107, in get_color
    return self.get_image(color, do_flip, crop_offset)
  File "/home/nitin/Desktop/pytorch_depth_from_videos_in_the_wild/datasets/custom_mono_dataset.py", line 71, in get_image
    image, ratio, delta_u, delta_v = image_resize(image, resize_h,
  File "/home/nitin/Desktop/pytorch_depth_from_videos_in_the_wild/lib/img_processing.py", line 49, in image_resize
    assert raw_h >= target_h, 'must be downscaling'
AssertionError: must be downscaling

@imneonizer
Thanks for pointing out the issue.
There are two constraints on the input size:
(1) the input width and height should be multiples of 32
(2) the size of your generated data must be larger than the size defined by the width and height set in the options

Constraint (1) exists because the depth encoder uses a ResNet that halves the input resolution 5 times. Keeping the dimensions multiples of 32 makes it easier to maintain dimension consistency between the depth encoder and the depth decoder, since the decoder consumes each of the encoder's intermediate layers.
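A minimal sketch (not code from the repo) of why the multiple-of-32 constraint matters: after 5 halvings (2**5 = 32), a dimension that is not a multiple of 32 gets rounded by the strided convolutions, so the decoder's doubled-up feature maps no longer line up with the encoder's skip connections.

```python
def encoder_sizes(size, num_halvings=5):
    """Trace the spatial size through 5 halvings, as in a ResNet encoder."""
    sizes = [size]
    for _ in range(num_halvings):
        size = size // 2  # integer division mimics strided-conv rounding
        sizes.append(size)
    return sizes

print(encoder_sizes(288))  # multiple of 32: [288, 144, 72, 36, 18, 9]
print(encoder_sizes(300))  # not a multiple: [300, 150, 75, 37, 18, 9]
```

For 300, upsampling back from 9 gives 18, then 36, which no longer matches the encoder's 37-wide feature map; for 288 every stage matches exactly.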

About constraint (2): we further rescale the camera intrinsics according to how the raw input images are scaled to the target size fed to the networks, which is defined by --width and --height in base_options.py. We decided to support only downscaling to simplify the development of the corresponding functions.
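A hypothetical sketch of the intrinsics rescaling described above (the function and parameter names are illustrative, not the repo's actual API); it also shows where a downscaling assertion like the one in img_processing.py would fire:

```python
def rescale_intrinsics(fx, fy, cx, cy, raw_w, raw_h, target_w, target_h):
    """Rescale pinhole intrinsics when downscaling an image to the network input size."""
    assert raw_w >= target_w and raw_h >= target_h, 'must be downscaling'
    sx, sy = target_w / raw_w, target_h / raw_h  # per-axis scale factors
    return fx * sx, fy * sy, cx * sx, cy * sy

# A 300x300 raw image downscaled to a 288x288 network input:
print(rescale_intrinsics(250.0, 250.0, 150.0, 150.0, 300, 300, 288, 288))
# (240.0, 240.0, 144.0, 144.0)
```

This is why the generated data (here 300x300) must be at least as large as the --width/--height options: upscaling would require a different code path that the project chose not to support.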

I just updated the main branch, please pull the latest update and try the following command:

python train.py \
    --data_path ./processed \
    --png \
    --learn_intrinsics \
    --weighted_ssim \
    --boxify \
    --model_name ./weights/trained \
    --width 288 \
    --height 288

(please add the --weighted_ssim flag)

Thanks again.

Best,
Bolian

Thank you for the explanation