
DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors
Jinbo Xing, Menghan Xia*, Yong Zhang, Haoxin Chen, Wangbo Yu,
Hanyuan Liu, Xintao Wang, Tien-Tsin Wong*, Ying Shan


(* corresponding authors)

From CUHK and Tencent AI Lab.

🔆 Introduction

🤗 DynamiCrafter can animate open-domain still images based on a text prompt by leveraging pre-trained video diffusion priors. Please check our project page and paper for more information.
😀 We will continue to improve the model's performance, including higher resolution, watermark removal, and better stability.

1. Showcases

"bear playing guitar happily, snowing" "boy walking on the street"
"two people dancing" "girl talking and blinking"
"zoom-in, a landscape, springtime" "A blonde woman rides on top of a moving
washing machine into the sunset."
"explode colorful smoke coming out" "a bird on the tree branch"

2. Applications

2.1 Storytelling video generation (see project page for more details)

2.2 Looping video generation

2.3 Generative frame interpolation

(See the project page for examples showing the input starting frame, the input ending frame, and the generated video.)

📝 Changelog

  • [2023.12.02]: 🔥🔥 Launch the local Gradio demo.
  • [2023.11.29]: 🔥🔥 Release the main model at a resolution of 256x256.
  • [2023.11.27]: 🔥🔥 Launch the project page and update the arXiv preprint.

🧰 Models

Model             Resolution   Checkpoint
DynamiCrafter256  256x256      Hugging Face

Animating one image takes approximately 10 seconds and peaks at about 20 GB of GPU memory on a single NVIDIA A100 (40 GB) GPU.
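
If you are unsure whether your GPU is large enough, a minimal pre-flight check like the sketch below can save a failed run. It only assumes PyTorch is installed (it is required for inference anyway); the 20 GB threshold is the peak figure reported above.

import torch

# Rough pre-flight check against the ~20 GB peak memory reported above.
# Note: this inspects total device memory, not memory currently free.
REQUIRED_GB = 20

if not torch.cuda.is_available():
    raise SystemExit("No CUDA device found; DynamiCrafter inference needs a GPU.")

total_gb = torch.cuda.get_device_properties(0).total_memory / 1024**3
print(f"GPU 0: {torch.cuda.get_device_name(0)}, {total_gb:.1f} GB total")
if total_gb < REQUIRED_GB:
    print(f"Warning: less than {REQUIRED_GB} GB of VRAM; inference may run out of memory.")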

⚙️ Setup

Install Environment via Anaconda (Recommended)

conda create -n dynamicrafter python=3.8.5
conda activate dynamicrafter
pip install -r requirements.txt
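
To confirm the environment is usable before downloading checkpoints, a quick sanity check (a minimal sketch; it only assumes that requirements.txt installs PyTorch) is:

python -c "import torch; print('torch', torch.__version__, '| CUDA available:', torch.cuda.is_available())"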

💫 Inference

1. Command line

  1. Download the pretrained model from Hugging Face and place it at checkpoints/dynamicrafter_256_v1/model.ckpt (a scripted download sketch follows after these commands).
  2. Run the command that matches your devices and needs in a terminal.
  # Run on a single GPU:
  sh scripts/run.sh
  # Run on multiple GPUs for parallel inference:
  sh scripts/run_mp.sh
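
If you prefer a scripted download, the huggingface_hub client can fetch the checkpoint directly. This is a minimal sketch: the repo_id below is an assumption inferred from the Hugging Face link in the Models table, so verify it before running.

from huggingface_hub import hf_hub_download

# Assumed repo id -- confirm it against the "Hugging Face" link in the
# Models table above before running.
ckpt_path = hf_hub_download(
    repo_id="Doubiiu/DynamiCrafter",
    filename="model.ckpt",
    local_dir="checkpoints/dynamicrafter_256_v1",
)
print("Checkpoint saved to", ckpt_path)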

2. Local Gradio demo

  1. Download the pretrained models and place them in the corresponding directory following the guidelines above.
  2. Run the following command in a terminal (a structural sketch of such a demo follows below).
  python gradio_app.py
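
For reference, the general shape of such a demo looks roughly like the sketch below. This is not the repo's actual gradio_app.py; animate is a hypothetical stand-in for the DynamiCrafter inference call.

import gradio as gr

def animate(image, prompt):
    # Hypothetical wrapper: run the DynamiCrafter sampler on `image`
    # conditioned on `prompt`, write the result to disk, return its path.
    out_path = "output.mp4"
    return out_path

demo = gr.Interface(
    fn=animate,
    inputs=[gr.Image(type="pil", label="Input image"),
            gr.Textbox(label="Text prompt")],
    outputs=gr.Video(label="Generated video"),
    title="DynamiCrafter (sketch)",
)
demo.launch()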

👨‍👩‍👧‍👦 Crafter Family

VideoCrafter1: Framework for high-quality video generation.

ScaleCrafter: Tuning-free method for high-resolution image/video generation.

TaleCrafter: An interactive story visualization tool that supports multiple characters.

LongerCrafter: Tuning-free method for longer high-quality video generation.

MakeYourVideo, might be a Crafter:): Video generation/editing with textual and structural guidance.

StyleCrafter: Stylized-image-guided text-to-image and text-to-video generation.

😉 Citation

@article{xing2023dynamicrafter,
  title={DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors},
  author={Xing, Jinbo and Xia, Menghan and Zhang, Yong and Chen, Haoxin and Yu, Wangbo and Liu, Hanyuan and Wang, Xintao and Wong, Tien-Tsin and Shan, Ying},
  journal={arXiv preprint arXiv:2310.12190},
  year={2023}
}

🙏 Acknowledgements

We would like to thank AK (@_akhaliq) for helping set up the Hugging Face online demo, and camenduru for providing the Replicate and Colab online demos.

📢 Disclaimer

We developed this repository for RESEARCH purposes, so it may only be used for personal, research, or other non-commercial purposes.