inference_helper.py

Hello, what does the cotr_patch_flow_exhaustive function in inference_helper.py do? What do p_i and p_j mean?

cotr_patch_flow_exhaustive is designed to estimate the optical flow (dense correspondence) between 2 non-square images at low resolution. Currently, COTR only supports 256x256 input. Therefore, to estimate the dense correspondence between a portrait image and a landscape image, we first cut each image into 2 overlapping square patches. We then estimate the flow 4 times, i.e. each patch in image A gets its correspondence estimated against both patches from image B. Finally, we merge the 4 flows into one final optical flow.
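To make the patch pairing concrete, here is a minimal sketch of the idea. It is not the code from inference_helper.py: the helper name `square_patch_boxes`, the choice to anchor the two squares at opposite ends of the long side, and the image sizes are all assumptions made for illustration. Two squares only cover the whole image when the long side is at most twice the short side.

```python
from itertools import product


def square_patch_boxes(height, width):
    """Return two overlapping square crops as (top, left, side) tuples.

    Hypothetical helper: the squares are anchored at opposite ends of the
    longer side, so they overlap in the middle of the image.
    """
    side = min(height, width)
    if height >= width:
        # Portrait: one square at the top, one at the bottom.
        return [(0, 0, side), (height - side, 0, side)]
    # Landscape: one square at the left, one at the right.
    return [(0, 0, side), (0, width - side, side)]


# Assumed example sizes: image A is portrait, image B is landscape.
boxes_a = square_patch_boxes(640, 480)
boxes_b = square_patch_boxes(480, 640)

# Every patch of A is paired with every patch of B: 2 x 2 = 4 flow estimates.
pairs = list(product(boxes_a, boxes_b))
for box_a, box_b in pairs:
    print("estimate flow:", box_a, "->", box_b)
```

Each of the 4 pairs would be passed to the model separately, and the resulting flows merged afterwards. How the merge resolves the overlapping regions is not covered by this sketch.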

I would like to know where the image is resampled from its original size to 256x256. I cannot find the exact place where this happens.

You can search for the keyword "MAX_SIZE".
For example,

COTR/COTR/inference/refinement_task.py

Lines 117 to 118 in 5c9363f

    
           img_from = np.array(PIL.Image.fromarray(img_from).resize((MAX_SIZE, MAX_SIZE), resample=PIL.Image.BILINEAR)) 
        
           img_to = np.array(PIL.Image.fromarray(img_to).resize((MAX_SIZE, MAX_SIZE), resample=PIL.Image.BILINEAR))
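For reference, the same resize call can be tried in isolation. This is a small self-contained sketch, assuming `MAX_SIZE` is 256 (inferred from the 256x256 input size mentioned earlier, not read from the repository) and using a hypothetical wrapper name, `resize_to_max_size`, and a dummy image.

```python
import numpy as np
import PIL.Image

MAX_SIZE = 256  # assumed value, matching the 256x256 input size mentioned earlier


def resize_to_max_size(img):
    """Resize an HxWx3 uint8 array to MAX_SIZE x MAX_SIZE with bilinear resampling."""
    pil_img = PIL.Image.fromarray(img)
    pil_img = pil_img.resize((MAX_SIZE, MAX_SIZE), resample=PIL.Image.BILINEAR)
    return np.array(pil_img)


# Dummy landscape image standing in for a real input.
img = np.zeros((480, 640, 3), dtype=np.uint8)
resized = resize_to_max_size(img)
print(resized.shape)
```

Whatever the original size, the array produced by this step has a fixed MAX_SIZE x MAX_SIZE shape.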

Thank you for your answer. Now I want to improve the matching speed. I thought changing the image size would make it faster, but it seems to have no obvious effect. Have you measured the number of parameters in the model, broken down per module?
