FADNet tensor size mismatched
Hi!
I tried to validate FADNet on the ETH3D dataset using the pretrained weights FADNet_sceneflow.pt. However, I ran into a dimension mismatch because the ETH3D image size is (513, 888). The error occurs in the line concat5 = torch.cat((upconv5, upflow6, conv5b), 1), where the three tensors have the following sizes:
upconv5: torch.Size([1, 512, 18, 28])
upflow6: torch.Size([1, 1, 18, 28])
conv5b: torch.Size([1, 512, 17, 28])
Is additional padding or resizing required here, or do you have any suggestions on how to address this for those who would like to run inference on datasets other than SceneFlow (such as ETH3D, Kitti2012, etc.)? Thanks!
For ETH3D, you can use the following transform setting:
val:
  - type: StereoPad
    size: [ 512, 960 ]
  - type: GetValidDispa
    max_disp: 192
  - type: TransposeImage
  - type: ToTensor
  - type: NormalizeImage
    mean: [ 0.485, 0.456, 0.406 ]
    std: [ 0.229, 0.224, 0.225 ]
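For most models, the input image height and width need to be a multiple of 64 or 32. The encoder halves the resolution several times and the decoder doubles it back, so an odd intermediate size (17 for conv5b in your case) no longer matches its upsampled counterpart (18 for upconv5). Generally speaking, using StereoPad with a suitable size will do the trick.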
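To make the size arithmetic concrete, here is a minimal sketch in plain Python. It assumes each stride-2 stage maps a length n to ceil(n / 2), which is the usual behaviour of a stride-2 convolution with "same"-style padding, and that the network has six such stages. The helper names downsample_sizes and padded_size are hypothetical and are not part of FADNet or of the StereoPad transform.

```python
import math


def downsample_sizes(size, levels=6):
    """Sizes after each stride-2 stage, assuming n -> ceil(n / 2) per stage."""
    sizes = []
    for _ in range(levels):
        size = math.ceil(size / 2)
        sizes.append(size)
    return sizes


def padded_size(height, width, multiple=64):
    """Smallest (height, width) >= the input that is divisible by `multiple`."""
    pad_h = (-height) % multiple  # rows to add, 0 if already aligned
    pad_w = (-width) % multiple   # columns to add, 0 if already aligned
    return height + pad_h, width + pad_w


heights = downsample_sizes(513)
print(heights)                      # [257, 129, 65, 33, 17, 9]
print(heights[4], 2 * heights[5])   # 17 vs 18: the mismatch reported above
print(padded_size(513, 888))        # (576, 896)
```

Under these assumptions the height 513 shrinks to 17 at 1/32 resolution and to 9 at 1/64, and upsampling 9 gives 18, which reproduces the reported 17 vs 18 mismatch. The width 888 shrinks to 28 and 14, and 2 × 14 = 28, so the widths agree. Both values in the size [ 512, 960 ] above are multiples of 64; the sketch only shows how to derive the smallest aligned size for a single image.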