demo video
qlawliet opened this issue · 2 comments
Hi, you've done a very impressive job. Thank you for the open source code.
One thing I'm curious about: in the demo video you showed the alignment of frame T with frames T-1 and T+1, and I'm wondering how that was visualized. I look forward to hearing from you.
Thanks for your interest in our work. Previously, I printed the predicted offsets and visualized the receptive fields accordingly. From model.py L202-206, we can see that the sampling positions are determined by the 3x3 kernels and the two predicted offsets, which are used to sample the support frames.
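To make this concrete, here is a minimal sketch (not the repository's code; the function name and offset layout are illustrative) of how a 3x3 deformable conv's sampling positions for one output pixel follow from the kernel grid plus the predicted offsets:

```python
# Hypothetical sketch: sampling positions of one 3x3 deformable conv tap set.
# `offsets` stands in for the predicted offsets; names are illustrative,
# not taken from model.py.

def deform_sampling_positions(y, x, offsets):
    """Return the 9 (row, col) positions a 3x3 deformable conv samples
    for the output pixel (y, x). `offsets` is a list of 9 (dy, dx)
    predicted offsets, one per kernel tap."""
    base_grid = [(ky, kx) for ky in (-1, 0, 1) for kx in (-1, 0, 1)]
    return [(y + ky + dy, x + kx + dx)
            for (ky, kx), (dy, dx) in zip(base_grid, offsets)]

# With all-zero offsets this reduces to the ordinary 3x3 neighbourhood:
positions = deform_sampling_positions(5, 5, [(0.0, 0.0)] * 9)
```

Printing `positions` for the learned (non-zero) offsets is what lets you mark the sampled locations on the support frame.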
One pixel in the predicted LR frame corresponds to 9 pixels in aligned_fea. Then, each pixel in aligned_fea corresponds to 9 pixels in fea (the actual number should be 9x8, since we use deformable conv groups=8; in practice, many of these pixels coincide, and we just keep the farthest pixels along each axis). Furthermore, we can find the associated positions in supp. In this way, we can visualize all 9^3 positions. But note that the actual receptive field, when propagated to the image level, should be even larger, since we have a feature extraction module before alignment.
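The tracing described above can be sketched as a simple recursion: each position in one layer expands into 9 sampling positions in the layer below, so after three deformable layers one output pixel maps to up to 9^3 positions. This is a hypothetical illustration (the `offset_layers` callback interface is an assumption, not the repository's API):

```python
# Hypothetical sketch: trace the receptive field of one output pixel
# back through stacked 3x3 deformable conv layers. Each position at the
# current level expands into 9 positions at the level below.

def trace_receptive_field(y, x, offset_layers):
    """`offset_layers` is a list (top layer first); each entry is a
    function mapping a position (y, x) to that layer's 9 predicted
    (dy, dx) offsets. Returns all traced positions in the bottom input."""
    grid = [(ky, kx) for ky in (-1, 0, 1) for kx in (-1, 0, 1)]
    positions = [(y, x)]
    for offsets_at in offset_layers:
        positions = [(py + ky + dy, px + kx + dx)
                     for (py, px) in positions
                     for (ky, kx), (dy, dx) in zip(grid, offsets_at((py, px)))]
    return positions

# With zero offsets, three layers yield 9**3 = 729 (heavily overlapping)
# positions spanning a 7x7 window around the output pixel:
zero_offsets = lambda pos: [(0.0, 0.0)] * 9
traced = trace_receptive_field(10, 10, [zero_offsets] * 3)
```

With learned offsets the 729 positions scatter much more widely, which is why collapsing to the farthest position per axis gives a readable visualization.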