hzwer/ECCV2022-RIFE

question about encode

MichalTurek opened this issue · 3 comments

Hi, could you please describe what exactly encode does in IFNet.forward(), and what its output represents?

    f0 = self.encode(img0[:, :3])
    f1 = self.encode(img1[:, :3])

It outputs a tensor of size (1, 8, width, height), but what exactly does this tensor represent?

hzwer commented

My detailed ideas can be seen in https://arxiv.org/abs/2310.17294.
Warping features works better than warping images alone; see "Context-aware Synthesis for Video Frame Interpolation".
It is an encoder learned by the model independently. I don't know how to explain what exactly it does.
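
To make the idea concrete, here is a minimal sketch of a learned encoder whose output is warped alongside the image. It is not the repository's actual encode module: the class name TinyEncoder, the layer sizes, and the backward_warp helper are all assumptions chosen for illustration. It only shows that a small convolutional stack can map an RGB frame to an 8-channel feature map at the same resolution, and that such a map can be warped by a flow field just like an image.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyEncoder(nn.Module):
    """Hypothetical stand-in for self.encode: RGB in, 8 feature channels out."""

    def __init__(self, out_channels: int = 8):
        super().__init__()
        # Stride-1 convolutions with padding keep the spatial size unchanged.
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),
            nn.LeakyReLU(0.2),
            nn.Conv2d(16, out_channels, kernel_size=3, padding=1),
        )

    def forward(self, img: torch.Tensor) -> torch.Tensor:
        return self.net(img)


def backward_warp(x: torch.Tensor, flow: torch.Tensor) -> torch.Tensor:
    """Sample x at positions displaced by flow (pixels; channel 0 = dx, 1 = dy)."""
    n, _, h, w = x.shape
    ys, xs = torch.meshgrid(
        torch.arange(h, dtype=x.dtype),
        torch.arange(w, dtype=x.dtype),
        indexing="ij",
    )
    # Absolute sampling positions in pixel units, shape (N, H, W).
    px = xs.unsqueeze(0) + flow[:, 0]
    py = ys.unsqueeze(0) + flow[:, 1]
    # Normalise to [-1, 1], the coordinate range grid_sample expects.
    gx = 2.0 * px / max(w - 1, 1) - 1.0
    gy = 2.0 * py / max(h - 1, 1) - 1.0
    grid = torch.stack((gx, gy), dim=-1)  # (N, H, W, 2)
    return F.grid_sample(
        x, grid, mode="bilinear", padding_mode="border", align_corners=True
    )


img0 = torch.rand(1, 3, 32, 48)       # (batch, RGB, height, width)
encoder = TinyEncoder()
f0 = encoder(img0)                    # (1, 8, 32, 48): 8 learned feature channels
flow = torch.zeros(1, 2, 32, 48)      # zero flow, used here only as a sanity check
warped_f0 = backward_warp(f0, flow)   # same shape as f0
```

The 8 channels have no fixed, human-readable meaning; they are whatever the training loss finds useful. Note that PyTorch orders the dimensions as (batch, channels, height, width).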

Okay, thank you. Another thing: is there any paper that further explains how the mask is estimated? I mean the mask used to merge the forward- and backward-warped images into the final output.

hzwer commented

You may refer to Super SloMo.
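
As a rough illustration of how such a mask is typically used, the sketch below blends two warped frames with a per-pixel soft mask. It assumes the network emits raw mask logits that are squashed to [0, 1] with a sigmoid; the function name blend_with_mask and the tensor names are hypothetical, and this is a generic soft-blending sketch rather than the exact formulation of any one paper.

```python
import torch


def blend_with_mask(
    warped0: torch.Tensor, warped1: torch.Tensor, mask_logits: torch.Tensor
) -> torch.Tensor:
    """Per-pixel convex combination of two warped frames.

    warped0, warped1: (N, 3, H, W) frames warped towards the target time.
    mask_logits:      (N, 1, H, W) raw network output (assumed, not confirmed).
    """
    # Sigmoid maps the logits to weights in [0, 1].
    mask = torch.sigmoid(mask_logits)
    # Where mask is near 1 the output follows warped0, near 0 it follows warped1.
    return warped0 * mask + warped1 * (1.0 - mask)


warped0 = torch.ones(1, 3, 4, 4)
warped1 = torch.zeros(1, 3, 4, 4)
mask_logits = torch.zeros(1, 1, 4, 4)  # logit 0 gives equal weight to both frames
merged = blend_with_mask(warped0, warped1, mask_logits)
```

Because the mask is learned together with the flow, the network can lower the weight of whichever warped frame is unreliable at a pixel, for example in occluded regions.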