This repository is the official implementation of Text2Video-Zero.
Text2Video-Zero: Text-to-Image Diffusion Models are Zero-Shot Video Generators
Levon Khachatryan, Andranik Movsisyan, Vahram Tadevosyan, Roberto Henschel, Zhangyang Wang, Shant Navasardyan, Humphrey Shi
Our method, Text2Video-Zero, enables zero-shot video generation using (i) a textual prompt (see rows 1 and 2), (ii) a prompt combined with guidance from poses or edges (see lower right), and (iii) Video Instruct-Pix2Pix, i.e., instruction-guided video editing (see lower left). Results are temporally consistent and closely follow the guidance and textual prompts.
- [03/23/2023] The Text2Video-Zero paper is released!
- [03/25/2023] The first version of our Hugging Face demo (zero-shot text-to-video generation and Video Instruct-Pix2Pix) is released!
Code will be released soon!
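In the meantime, since Text2Video-Zero runs on top of off-the-shelf text-to-image checkpoints, zero-shot text-to-video generation can already be tried through the Hugging Face diffusers integration of this method (`TextToVideoZeroPipeline`, available from diffusers v0.15). The sketch below is illustrative, not the official release; the checkpoint name, prompt, frame count, and fps are assumptions chosen for the example.

```python
# Minimal sketch: zero-shot text-to-video via the diffusers integration
# of Text2Video-Zero (TextToVideoZeroPipeline, diffusers >= 0.15).
# Assumes a CUDA GPU; checkpoint, prompt, video_length, and fps are
# illustrative choices, not settings from this repository.
import torch
import imageio
from diffusers import TextToVideoZeroPipeline

model_id = "runwayml/stable-diffusion-v1-5"  # any Stable Diffusion checkpoint
pipe = TextToVideoZeroPipeline.from_pretrained(
    model_id, torch_dtype=torch.float16
).to("cuda")

# Generate a short sequence of temporally consistent frames from text alone.
prompt = "A panda dancing in Antarctica"
frames = pipe(prompt=prompt, video_length=8).images

# Frames come back as float arrays in [0, 1]; convert and save as a clip.
frames = [(frame * 255).astype("uint8") for frame in frames]
imageio.mimsave("video.mp4", frames, fps=4)
```

Pose- and edge-guided generation and Video Instruct-Pix2Pix follow the same zero-shot recipe, additionally conditioning the frames with ControlNet or InstructPix2Pix, respectively.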
"A bear dancing on the concrete" | "An alien dancing under a flying saucer | "A panda dancing in Antarctica" | "An astronaut dancing in the outer space" |
"White butterfly" | "Beautiful girl | "A jellyfish" | "beautiful girl halloween style" |
"Wild fox is walking" | "Oil painting of a beautiful girl close-up | "A santa claus" | "A deer" |
"anime style" | "arcane style | "gta-5 man" | "avar style" |
"Replace man with chimpanze" | "Make it Van Gogh Starry Night style" | "Make it Picasso style" |
"Make it Expressionism style" | "Make it night" | "Make it autumn" |
If you use our work in your research, please cite our publication:
@article{text2video-zero,
    title={Text2Video-Zero: Text-to-Image Diffusion Models are Zero-Shot Video Generators},
    author={Khachatryan, Levon and Movsisyan, Andranik and Tadevosyan, Vahram and Henschel, Roberto and Wang, Zhangyang and Navasardyan, Shant and Shi, Humphrey},
    journal={arXiv preprint arXiv:2303.13439},
    year={2023}
}