leonzfa/iResNet

displacement range in correlation and EPE computation

stalin18 opened this issue · 3 comments

Hi, in your paper you wrote that "Correlation with a large displacement (i.e., 40) is performed between conv2a and conv2b" and "correlation with a small displacement (i.e., 20) is performed to capture fine-grained but short range correspondence". However, in Table 1 (detailed architecture), the output of corr1d has 81 channels, while the output of r_corr has 41 channels. Is this a mistake? Do you actually use displacements of 80 and 40 instead of 40 and 20?

Lastly, I just wanted to confirm this: is the EPE reported in Table 2 (Comparative results on the Scene Flow dataset for networks with different settings) calculated on the entire Scene Flow test set, without discarding any images or pixels? And is it only in Table 3 (comparison with CRL) that you discard some test images, following the same procedure as in the CRL paper?

Thank you very much for the code and all the help!

@stalin18 Hi, we use two-direction correlation in the corr1d and r_corr layers in the paper, so the channel numbers are 40 × 2 + 1 = 81 and 20 × 2 + 1 = 41, respectively.

layer {
  name: "corr"
  type: "Correlation1D"
  bottom: "conv2a"
  bottom: "conv2b"
  top: "corr"
  correlation_param {
    pad: 40
    kernel_size: 1
    max_displacement: 40
    stride_1: 1
    stride_2: 1
  }
}
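As a rough illustration of why a two-direction search with max_displacement d yields 2d + 1 channels, here is a minimal NumPy sketch of a horizontal-only correlation with kernel_size 1. It is not the Caffe Correlation1D layer itself: the function name `correlation_1d`, the (C, H, W) layout, and the normalization by the number of feature channels are assumptions made for this example.

```python
import numpy as np

def correlation_1d(left, right, max_displacement):
    """Hypothetical sketch of a two-direction horizontal correlation.

    left, right: feature maps of shape (C, H, W).
    Returns an array of shape (2 * max_displacement + 1, H, W),
    one channel per horizontal shift in [-d, ..., +d].
    """
    _, h, w = left.shape
    d = max_displacement
    # Zero-pad the width on both sides so every shifted window stays in bounds
    # (this plays the role of pad: 40 in the layer definition).
    padded = np.pad(right, ((0, 0), (0, 0), (d, d)))
    out = np.empty((2 * d + 1, h, w), dtype=left.dtype)
    for i, shift in enumerate(range(-d, d + 1)):
        # shifted[:, y, x] == right[:, y, x + shift], or 0 outside the image
        shifted = padded[:, :, d + shift : d + shift + w]
        # Assumed normalization: average the product over feature channels.
        out[i] = (left * shifted).mean(axis=0)
    return out
```

With max_displacement set to 40 this produces 81 output channels, and with 20 it produces 41, which matches the channel counts in Table 1.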

In fact, there is no need to use two-direction correlation in the Correlation1D layer; you can set the parameter single_direction = -1:

layer {
  name: "corr"
  type: "Correlation1D"
  bottom: "conv2a"
  bottom: "conv2b"
  top: "corr"
  correlation_param {
    pad: 40
    kernel_size: 1
    max_displacement: 40
    single_direction: -1
    stride_1: 1
    stride_2: 1
  }
}
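The channel arithmetic for the different settings can be sketched with a small helper. The function `correlation_channels` is hypothetical and written only for this illustration. In particular, the single-direction count of d + 1 (shifts 0 to d on one side only) is an assumption about how the layer behaves and is not confirmed in this thread. The two-dimensional case is included only for contrast with a 2D search window such as the one in FlowNet's correlation layer.

```python
def correlation_channels(max_displacement, stride_2=1,
                         single_direction=0, two_dimensional=False):
    """Hypothetical helper: output channels of a correlation layer."""
    # Number of sampled shifts on each side of zero displacement.
    radius = max_displacement // stride_2
    if two_dimensional:
        # 2D search window: D x D shifts, with D = 2 * radius + 1.
        return (2 * radius + 1) ** 2
    if single_direction != 0:
        # Assumption: only shifts 0..radius on one side are searched.
        return radius + 1
    # 1D two-direction search: shifts -radius..+radius.
    return 2 * radius + 1

print(correlation_channels(40))                       # corr1d in the paper
print(correlation_channels(20))                       # r_corr in the paper
print(correlation_channels(40, single_direction=-1))  # assumed one-sided count
```

The point is that a 1D (horizontal-only) correlation grows linearly with the displacement, giving D = 2d + 1 channels, whereas a 2D search window grows as D².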

Yes, in Table 2 we calculated EPE on the entire Scene Flow test set (all 4370 image pairs). CRL did not report results on the entire test set, so we removed some images just as CRL did, for a fair comparison.

Hello,
I have a question. In "FlowNet: Learning Optical Flow with Convolutional Networks", the feature map produced by the correlation layer has size w × h × D², where D = 2d + 1. In your paper, "Correlation with a large displacement (i.e., 40)" and "Correlation with a small displacement (i.e., 20)" give D = 2 × 40 + 1 = 81 and D = 2 × 20 + 1 = 41, so the feature maps would have size w × h × 81 × 81 and w × h × 41 × 41, and the channel counts would not be 81 and 41. If the output has 81 channels, then D = 9 and d = 4, but if the output has 41 channels, D and d cannot be determined, since 41 is not a perfect square.

Thank you very much!!