InterpAny-Clearer

🚀 Clearer Frames, Anytime: Resolving Velocity Ambiguity in Video Frame Interpolation

by Zhihang Zhong¹*, Gurunandan Krishnan², Xiao Sun¹, Yu Qiao¹, Sizhuo Ma²†, and Jian Wang²†

¹Shanghai AI Laboratory, OpenGVLab, ²Snap Inc., *First author, †Co-corresponding authors

We strongly recommend referring to the project page and interactive demo for a better understanding:

👉 project page
👉 interactive demo
👉 arXiv
👉 slides

Please leave a ⭐ if you like this project! 🔥 🔥 🔥

TL;DR:

We address velocity ambiguity in video frame interpolation through distance indexing and iterative reference-based estimation, resulting in:
Clearer anytime frame interpolation and manipulated interpolation of anything

Time indexing vs. Distance indexing

[T] RIFE vs. [D,R] RIFE (Ours)

Preparation

Conda environment installation:

conda create -n InterpAny python=3.8
conda activate InterpAny
pip install torch==1.12.1+cu116 torchvision==0.13.1+cu116 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu116
pip install -r requirements.txt
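
After installation, it can help to confirm that the pinned packages actually landed in the active environment. The following is a minimal sketch, not part of the repository; the helper name `installed_versions` is hypothetical. It only reads package metadata, so it does not import PyTorch or touch the GPU.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("torch", "torchvision", "torchaudio")):
    """Return {package: version string, or None if the package is missing}."""
    report = {}
    for name in packages:
        try:
            report[name] = version(name)
        except PackageNotFoundError:
            # Package is not installed in the active environment.
            report[name] = None
    return report


# Print one line per package, e.g. to compare against the versions pinned above.
for name, found in installed_versions().items():
    print(f"{name}: {found or 'not installed'}")
```

With the install command above, the reported versions would be expected to match the pinned ones (torch 1.12.1, torchvision 0.13.1, torchaudio 0.12.1), possibly with a `+cu116` suffix.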

Download checkpoints

Download checkpoints from here.

P.S. RIFE-pro denotes the RIFE model trained with more data and epochs.

Alternative: Docker

You can build a Docker image with all the dependencies installed. See docker/README.md for more details.

Inference

Two images

python inference_img.py --img0 [IMG_0] --img1 [IMG_1] --model [MODEL_NAME] --variant [VARIANT] --checkpoint [CHECKPOINT] --save_dir [SAVE_DIR] --num [NUM] --gif

Examples:

python inference_img.py --img0 ./demo/I0_0.png --img1 ./demo/I0_1.png --model RIFE --variant DR --checkpoint ./checkpoints/RIFE/DR-RIFE-pro --save_dir ./results/I0_results_DR-RIFE-pro --num 1 1 1 1 1 1 1 --gif

python inference_img.py --img0 ./demo/I0_0.png --img1 ./demo/I0_1.png --model EMA-VFI --variant DR --checkpoint ./checkpoints/EMA-VFI/DR-EMA-VFI --save_dir ./results/I0_results_DR-EMA-VFI/ --num 1 1 1 1 1 1 1 --gif
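
To interpolate several image pairs in one go, the example commands above can be assembled programmatically. This is a rough sketch under stated assumptions: the flags are copied verbatim from the examples, the defaults mirror the first example, and the helper names `build_inference_cmd` and `run_pairs` are hypothetical rather than part of the repository.

```python
import subprocess
import sys


def build_inference_cmd(img0, img1, save_dir, model="RIFE", variant="DR",
                        checkpoint="./checkpoints/RIFE/DR-RIFE-pro",
                        num=(1, 1, 1, 1, 1, 1, 1), gif=True):
    """Build the argument list for one inference_img.py call."""
    cmd = [
        sys.executable, "inference_img.py",
        "--img0", img0, "--img1", img1,
        "--model", model, "--variant", variant,
        "--checkpoint", checkpoint, "--save_dir", save_dir,
        "--num", *[str(n) for n in num],  # one CLI token per integer
    ]
    if gif:
        cmd.append("--gif")
    return cmd


def run_pairs(pairs, out_root="./results"):
    """Run inference for each (name, img0, img1) triple; stop on the first failure."""
    for name, img0, img1 in pairs:
        cmd = build_inference_cmd(img0, img1, f"{out_root}/{name}")
        subprocess.run(cmd, check=True)
```

For instance, `run_pairs([("I0", "./demo/I0_0.png", "./demo/I0_1.png")])` would issue the same call as the first example, with a different output directory, when run from the repository root.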

Video

python inference_video.py --video [VIDEO] --model [MODEL_NAME] --variant [VARIANT] --checkpoint [CHECKPOINT] --save_dir [SAVE_DIR] --num [NUM]

Examples:

python inference_video.py --video ./demo/demo.mp4 --model RIFE --variant DR --checkpoint ./checkpoints/RIFE/DR-RIFE-pro --save_dir ./results/demo_results_DR-RIFE-pro --num 3

Manipulation

Manipulated interpolation of anything

Demos

Additional installation

Follow ./webapp/backend/README.md to set up the environment for Segment Anything.
Follow ./webapp/webapp/README.md to set up the environment for the webapp.

Run the app

cd ./webapp/backend/
python app.py

# open a new terminal
cd ./webapp/webapp/
yarn && yarn start

Dataset

You can download the split Vimeo90K dataset with our distance indexing maps from here (or the full dataset), and then merge the parts:

cat vimeo_septuplet_split.zipa* > vimeo_septuplet_split.zip
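
On systems without `cat` (for example, a plain Windows shell), the same merge can be done in Python. This is a minimal sketch; the helper name `merge_parts` is hypothetical. It assumes that the part suffixes sort lexicographically in the order they were split, as is the case for `aa`, `ab`, and so on.

```python
import glob
import shutil


def merge_parts(pattern, output_path):
    """Concatenate all files matching `pattern`, in sorted order, into `output_path`."""
    parts = sorted(glob.glob(pattern))
    if not parts:
        raise FileNotFoundError(f"no files match {pattern!r}")
    with open(output_path, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                # Stream in chunks so large archives never sit fully in memory.
                shutil.copyfileobj(src, out)
    return parts


# Usage, equivalent to the cat command above:
# merge_parts("vimeo_septuplet_split.zipa*", "vimeo_septuplet_split.zip")
```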

Alternatively, you can download the original Vimeo90K dataset from here and then generate the distance indexing maps (P.S. download the RAFT checkpoints and put them under ./RAFT/models/ in advance):

python multiprocess_create_dis_index.py
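
For intuition about what a distance indexing map contains, here is a minimal sketch. It assumes the per-pixel definition in which the optical flow from frame 0 to frame t is projected onto the flow from frame 0 to frame 1 and normalized by the latter's length. The function name `distance_index`, the `(H, W, 2)` array layout, and the `eps` guard are illustrative assumptions; the repository script may compute the maps differently, for example with extra filtering or clipping.

```python
import numpy as np


def distance_index(flow_0t, flow_01, eps=1e-6):
    """Per-pixel ratio |F_0t| * cos(theta) / |F_01| for flows of shape (H, W, 2).

    theta is the angle between the two flow vectors, so the ratio equals
    dot(F_0t, F_01) / |F_01|^2.
    """
    dot = np.sum(flow_0t * flow_01, axis=-1)
    norm_sq = np.sum(flow_01 ** 2, axis=-1)
    # Guard against division by zero where the full-interval flow is (near) zero.
    return dot / np.maximum(norm_sq, eps)


# Toy example: every pixel moves 4 px to the right between frames 0 and 1
# and has covered 1 px of that path at time t.
flow_01 = np.zeros((2, 3, 2), dtype=np.float32)
flow_01[..., 0] = 4.0
flow_0t = np.zeros((2, 3, 2), dtype=np.float32)
flow_0t[..., 0] = 1.0
index_map = distance_index(flow_0t, flow_01)  # every entry is 0.25
```

In this toy case the map is constant, but with real flow each pixel gets its own value, which is how the map can describe how far along its path an object is rather than only the elapsed time.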

Train

Training command:

python train.py --model [MODEL_NAME] --variant [VARIANT]

Examples:

python train.py --model RIFE --variant D

python train.py --model RIFE --variant DR

python train.py --model AMT-S --variant D

python train.py --model AMT-S --variant DR

Test

Testing with precomputed distance maps:

python test.py --model [MODEL_NAME] --variant [VARIANT]

Examples:

python test.py --model RIFE --variant D

python test.py --model RIFE --variant DR

Testing with uniform distance maps (the same inputs as the time indexes):

python test.py --model [MODEL_NAME] --variant [VARIANT] --uniform

Examples:

python test.py --model RIFE --variant D --uniform

python test.py --model RIFE --variant DR --uniform
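
A uniform distance map can be pictured as a map whose every pixel holds the time index t, so a distance-indexed model receives the same information a time-indexed model would. The sketch below illustrates that reading only; the helper name `uniform_distance_map` is hypothetical, and how `test.py` builds its maps internally is not shown here.

```python
import numpy as np


def uniform_distance_map(height, width, t):
    """Return an (H, W) float32 map filled with the time index t in [0, 1]."""
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    return np.full((height, width), t, dtype=np.float32)


# Example: the map for the temporal midpoint of a 4x6 frame.
mid_map = uniform_distance_map(4, 6, 0.5)
```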

Citation

If you find this repository useful, please consider citing:

@article{zhong2023clearer,
  title={Clearer Frames, Anytime: Resolving Velocity Ambiguity in Video Frame Interpolation},
  author={Zhong, Zhihang and Krishnan, Gurunandan and Sun, Xiao and Qiao, Yu and Ma, Sizhuo and Wang, Jian},
  journal={arXiv preprint arXiv:2311.08007},
  year={2023}
}

Acknowledgements

We thank Dorian Chan, Zhirong Wu, and Stephen Lin for their insightful feedback and advice. Our thanks also go to Vu An Tran for developing the web application, and to Wei Wang for coordinating the user study.

Moreover, we appreciate the following projects for releasing their code:

[ECCV 2020] RAFT: Recurrent All-Pairs Field Transforms for Optical Flow
[ECCV 2022] Real-Time Intermediate Flow Estimation for Video Frame Interpolation
[CVPR 2022] IFRNet: Intermediate Feature Refine Network for Efficient Frame Interpolation
[CVPR 2023] AMT: All-Pairs Multi-Field Transforms for Efficient Frame Interpolation
[CVPR 2023] Extracting Motion and Appearance via Inter-Frame Attention for Efficient Video Frame Interpolation
[ICCV 2023] Segment Anything
