ubc-vision/COTR

question about the code detail

Thanks for your great code and paper. I am new to this area and have two questions about the code:
1. In cotr-patch-flow-exhaustive:
What is the meaning of cycle_grid? Why is the norm computed from cycle_grid and in_grid used as the confidence?
2. What does the function merge-flow-patches do with the correspondences?
Looking forward to your reply.

  1. cycle grid:
    Notice that out_grid contains two correspondence maps: its left half is the correspondence map from left to right, and its right half is the map from right to left.
    [image]
    Consider a perfect correspondence map: the value at the left red point is the coordinates of the right red point, and the value at the right red point is the coordinates of the left red point.
    If we then use out_grid to sample out_grid itself, the sampled value at the left red point should be the value stored at the right red point, which is the coordinates of the left red point itself.
    Therefore the sampled output, i.e. cycle_grid, should be the default mesh grid, which is the same as in_grid.
    Because our network is not perfect, its predictions contain errors, so cycle_grid deviates from in_grid. We use the norm of that deviation as the confidence value.
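
The cycle check described above can be sketched in a few lines of NumPy. This is only an illustration under simplifying assumptions, not the repository's implementation: it uses pixel coordinates and nearest-neighbour lookup, and the helper names (`make_in_grid`, `sample_nearest`, `cycle_error`) are hypothetical. The toy "perfect" map assumes the two images are identical, so a left pixel at (x, y) matches the right pixel at (x + W, y).

```python
import numpy as np

def make_in_grid(h, w):
    # Default mesh grid: in_grid[y, x] holds the pixel's own (x, y) coordinates.
    xs, ys = np.meshgrid(np.arange(w), np.arange(h))
    return np.stack([xs, ys], axis=-1).astype(np.float64)

def sample_nearest(grid, coords):
    # Look up `grid` at the (x, y) locations stored in `coords`
    # (nearest neighbour, clamped to the canvas; an assumption of this sketch).
    h, w = grid.shape[:2]
    x = np.clip(np.rint(coords[..., 0]).astype(int), 0, w - 1)
    y = np.clip(np.rint(coords[..., 1]).astype(int), 0, h - 1)
    return grid[y, x]

def cycle_error(out_grid, in_grid):
    # Use out_grid to sample out_grid itself, then measure how far
    # the round trip lands from the starting pixel.
    cycle_grid = sample_nearest(out_grid, out_grid)
    return np.linalg.norm(cycle_grid - in_grid, axis=-1)

H, W = 4, 3                       # each image is H x W; the canvas is H x 2W
in_grid = make_in_grid(H, 2 * W)

# Perfect prediction: left half points to the right half and vice versa.
perfect = in_grid.copy()
perfect[:, :W, 0] += W
perfect[:, W:, 0] -= W
perfect_error = cycle_error(perfect, in_grid)

# Imperfect prediction: one left pixel is matched one pixel too far right.
noisy = perfect.copy()
noisy[0, 0, 0] += 1.0
noisy_error = cycle_error(noisy, in_grid)
```

With the perfect map the error is zero everywhere. With the perturbed map the error becomes non-zero at the affected pixels, which is why a lower value indicates a more trustworthy correspondence.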

  2. merge_patch_flow
    To process two images with different aspect ratios, we can either simply rescale both to squares or use a tiling strategy. merge_patch_flow is used for the tiling strategy: we first cut the input images into multiple square patches (see to_square_patches), then run an exhaustive correspondence prediction between all pairs of patches. For example, if each image produces 2 patches, we run the correspondence prediction 4 times. The question is then how to merge the 4 correspondence maps.
    We merge all 4 correspondence maps based on the confidence values. Note that in our case a lower value is better.
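
A minimal sketch of confidence-based merging, assuming every candidate correspondence map has already been placed on a common H x W grid and comes with a per-pixel error map (lower is better). The function name `merge_by_lowest_error` and the toy data are hypothetical. The repository's function also has to handle where each patch sits in the full image, which this sketch ignores.

```python
import numpy as np

def merge_by_lowest_error(corr_maps, errors):
    # corr_maps: list of (H, W, 2) arrays, one per patch pair
    # errors:    list of (H, W) arrays, cycle error per pixel (lower is better)
    corr_maps = np.stack(corr_maps)        # (N, H, W, 2)
    errors = np.stack(errors)              # (N, H, W)
    best = np.argmin(errors, axis=0)       # index of the best candidate per pixel
    rows, cols = np.indices(best.shape)
    merged_corr = corr_maps[best, rows, cols]   # (H, W, 2)
    merged_err = errors[best, rows, cols]       # (H, W)
    return merged_corr, merged_err

# Toy example: two candidates on a 1 x 2 grid.
corr_a = np.array([[[1.0, 1.0], [2.0, 2.0]]])
err_a = np.array([[0.1, 5.0]])
corr_b = np.array([[[9.0, 9.0], [8.0, 8.0]]])
err_b = np.array([[3.0, 0.2]])

merged_corr, merged_err = merge_by_lowest_error([corr_a, corr_b], [err_a, err_b])
```

Here the first pixel keeps candidate A's prediction and the second keeps candidate B's, because those have the lower error at each location.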