GongyeLiu, Menghan Xia*, Yong Zhang, Haoxin Chen, Jinbo Xing,
Xintao Wang, Yujiu Yang*, Ying Shan
(* corresponding authors)
From Tsinghua University and Tencent AI Lab.
TL;DR: We propose StyleCrafter, a generic method that enhances pre-trained T2V models with style control, supporting Style-Guided Text-to-Image Generation and Style-Guided Text-to-Video Generation.
- [2023.12.08]: 🔥🔥 Release the Hugging Face online demo.
- [2023.12.05]: 🔥🔥 Release the code and checkpoint.
- [2023.11.30]: 🔥🔥 Release the project page.
- Remove video watermarks (an artifact of training on WebVid10M).
| Model | Resolution | Checkpoint |
|---|---|---|
| StyleCrafter | 320x512 | Hugging Face |
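The checkpoint above can also be fetched programmatically with `huggingface_hub`. Below is a minimal sketch; the repo id `liuhuohuo/StyleCrafter` and the local directory are assumptions, so substitute the actual Hugging Face repository linked in the table.

```python
# Minimal sketch: download the StyleCrafter checkpoint from Hugging Face.
# NOTE: the repo id below is an assumption; use the repository linked above.
from huggingface_hub import snapshot_download

ckpt_dir = snapshot_download(
    repo_id="liuhuohuo/StyleCrafter",      # assumed repo id
    local_dir="checkpoints/stylecrafter",  # assumed target directory
)
print(f"Checkpoint files downloaded to: {ckpt_dir}")
```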
It takes approximately 5 seconds to generate a 512×512 image and 85 seconds to generate a 320×512 video with 16 frames on a single NVIDIA A100 (40 GB) GPU. A GPU with at least 16 GB of memory is required for inference.
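As a quick sanity check before launching inference, the sketch below verifies that the visible GPU meets the 16 GB requirement stated above. It uses only standard PyTorch CUDA queries; the threshold value is taken from this README, not from the codebase.

```python
# Minimal sketch: check that the current GPU has enough memory for inference.
import torch

def check_gpu(min_gib: float = 16.0) -> None:
    if not torch.cuda.is_available():
        raise RuntimeError("A CUDA GPU is required for inference.")
    props = torch.cuda.get_device_properties(0)
    total_gib = props.total_memory / 1024**3  # bytes -> GiB
    print(f"{props.name}: {total_gib:.1f} GiB total memory")
    if total_gib < min_gib:
        raise RuntimeError(f"At least {min_gib:.0f} GiB of GPU memory is required.")

check_gpu()
```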
- Run `Install_cn.ps1` in PowerShell
- Run `run_gui.ps1` in PowerShell
VideoCrafter1: Framework for high-quality text-to-video generation.
ScaleCrafter: Tuning-free method for high-resolution image/video generation.
TaleCrafter: An interactive story visualization tool that supports multiple characters.
LongerCrafter: Tuning-free method for longer high-quality video generation.
DynamiCrafter: Animating open-domain still images to high-quality videos.
We developed this repository for RESEARCH purposes only; it may be used solely for personal, research, and other non-commercial purposes.
We would like to thank AK (@_akhaliq) for helping set up the online demo.
If you have any comments or questions, feel free to contact lgy22@mails.tsinghua.edu.cn.