
Mora: More like Sora for Generalist Video Generation

๐Ÿ” See our newest Video Generation paper: "Mora: Enabling Generalist Video Generation via A Multi-Agent Framework" Paper GitHub Project)

📧 Please let us know if you find a mistake or have any suggestions by e-mail: lis221@lehigh.edu

📰 News

🚀️ Oct 9: Our Mora v2 paper and training code are coming soon.

🚀️ Jun 13: Our code is released!

🚀️ Mar 20: Our paper "Mora: Enabling Generalist Video Generation via A Multi-Agent Framework" is released!

What is Mora

Mora is a multi-agent framework for generalist video generation: it coordinates multiple specialized visual agents that collaborate to complete each task. It aims to replicate and extend the capabilities of OpenAI's Sora.
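
As a rough, purely illustrative sketch of this agent-chaining idea (the agent names, ordering, and interfaces below are assumptions made for exposition and do not come from the Mora codebase or its API):

```python
# Illustrative sketch only: agent names and interfaces are hypothetical and
# do not reflect the actual Mora implementation.
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class VisualAgent:
    """One specialized agent in the pipeline, wrapping a single model call."""
    name: str
    model: Callable[[Any], Any]  # stand-in for a real generative model

    def run(self, payload: Any) -> Any:
        print(f"[{self.name}] received {type(payload).__name__}")
        return self.model(payload)


def text_to_video(prompt: str) -> Any:
    """Chain agents: enhance the prompt, render a first frame, animate it."""
    pipeline = [
        # The lambdas below are placeholders for real model calls.
        VisualAgent("prompt-enhancer", lambda p: p + ", cinematic, high detail"),
        VisualAgent("text-to-image", lambda p: {"first_frame": f"image({p})"}),
        VisualAgent("image-to-video", lambda img: {"video": img["first_frame"]}),
    ]
    payload: Any = prompt
    for agent in pipeline:  # each agent consumes the previous agent's output
        payload = agent.run(payload)
    return payload


if __name__ == "__main__":
    print(text_to_video("A vibrant coral reef teeming with life"))
```

The point of the sketch is the hand-off pattern: each agent only needs to understand the output of the agent before it, so models can be swapped or re-ordered per task.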

📹 Demo for Artist Creation

Inspired by OpenAI Sora: First Impressions, we use Mora to generate the Shy Kids video. Although Mora reaches a similar video duration to Sora (about 80 seconds), it still lags significantly behind in resolution, object consistency, motion smoothness, and other aspects.

demo_shy_kids.mp4

🎥 Demo (1024×576 resolution, 12 seconds and more!)

Mora: A Multi-Agent Framework for Video Generation


  • Multi-Agent Collaboration: Utilizes several advanced visual AI agents, each specializing in different aspects of the video generation process, to achieve high-quality outcomes across various tasks.
  • Broad Spectrum of Tasks: Capable of performing text-to-video generation, text-conditional image-to-video generation, extending generated videos, video-to-video editing, connecting videos, and simulating digital worlds, thereby covering an extensive range of video generation applications (see the routing sketch after this list).
  • Open-Source and Extendable: Mora's open-source nature fosters innovation and collaboration within the community, allowing for continuous improvement and customization.
  • Proven Performance: Experimental results demonstrate Mora's ability to achieve performance that is close to that of Sora in various tasks, making it a compelling open-source alternative for the video generation domain.
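
As a loose sketch of how the tasks listed above could be routed to different agent orderings (the mapping below is an assumption for illustration, not the published Mora routing):

```python
# Hypothetical task-to-agent routing for illustration only; the real Mora
# pipelines may differ in both the agents used and their ordering.
AGENT_PIPELINES: dict[str, list[str]] = {
    "text-to-video":           ["prompt-enhancer", "text-to-image", "image-to-video"],
    "image-to-video":          ["prompt-enhancer", "image-to-video"],
    "extend-video":            ["image-to-video"],  # continue from the last frame
    "video-to-video-editing":  ["prompt-enhancer", "image-to-image", "image-to-video"],
    "connect-videos":          ["video-connection"],
    "simulate-digital-worlds": ["prompt-enhancer", "text-to-image", "image-to-video"],
}


def plan(task: str) -> list[str]:
    """Return the ordered agent names for a requested task."""
    if task not in AGENT_PIPELINES:
        raise ValueError(f"Unsupported task: {task!r}")
    return AGENT_PIPELINES[task]


if __name__ == "__main__":
    print(plan("video-to-video-editing"))
```

Keeping the routing in a plain table like this is one way an open-source framework can stay extendable: adding a new task is a matter of registering a new agent sequence rather than changing the agents themselves.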

Results

Text-to-video generation

Input prompt Output video
A vibrant coral reef teeming with life under the crystal-clear blue ocean, with colorful fish swimming among the coral, rays of sunlight filtering through the water, and a gentle current moving the sea plants.
A majestic mountain range covered in snow, with the peaks touching the clouds and a crystal-clear lake at its base, reflecting the mountains and the sky, creating a breathtaking natural mirror.
In the middle of a vast desert, a golden desert city appears on the horizon, its architecture a blend of ancient Egyptian and futuristic elements. The city is surrounded by a radiant energy barrier, while in the air, seve

Text-conditional image-to-video generation

Input prompt Input image Mora generated Video Sora generated Video
Monster Illustration in the flat design style of a diverse family of monsters. The group includes a furry brown monster, a sleek black monster with antennas, a spotted green monster, and a tiny polka-dotted monster, all interacting in a playful environment.
An image of a realistic cloud that spells "SORA".

Extend generated video

Original video Mora extended video Sora extended video

Video-to-video editing

Instruction Original video Mora edited Video Sora edited Video
Change the setting to the 1920s with an old school car. make sure to keep the red color.
Put the video in space with a rainbow road

Connect videos

Input previous video Input next video Output connect Video

Simulate digital worlds

Mora simulating video Sora simulating video

Getting Started

Our code has been released (see the Jun 13 news item above); the Mora v2 training code is coming soon.

Citation

@article{yuan2024mora,
  title={Mora: Enabling Generalist Video Generation via A Multi-Agent Framework},
  author={Yuan, Zhengqing and Chen, Ruoxi and Li, Zhaoxu and Jia, Haolong and He, Lifang and Wang, Chi and Sun, Lichao},
  journal={arXiv preprint arXiv:2403.13248},
  year={2024}
}
@article{liu2024sora,
  title={Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models},
  author={Liu, Yixin and Zhang, Kai and Li, Yuan and Yan, Zhiling and Gao, Chujie and Chen, Ruoxi and Yuan, Zhengqing and Huang, Yue and Sun, Hanchi and Gao, Jianfeng and others},
  journal={arXiv preprint arXiv:2402.17177},
  year={2024}
}
@misc{openai2024sorareport,
  title={Video generation models as world simulators},
  author={OpenAI},
  year={2024},
  howpublished={https://openai.com/research/video-generation-models-as-world-simulators},
}