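👋 Join our WeChat community
The Mini Sora open-source community is organized voluntarily by its members (it is free of charge and does not fleece participants). Mini Sora plans to explore how Sora can be implemented and where it may go next:
- We will hold regular Sora roundtables to explore possibilities together with the community
- Discussion of existing technical approaches to video generation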
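Speaker: Zhen Xing (邢桢), PhD student at the Vision and Learning Lab, Fudan University
Livestream highlights:
- Fundamentals of diffusion models for image generation
- The development of text-to-video diffusion models
- A brief look at the technology behind Sora and the challenges of reproducing it
Livestream time: 02/28 20:00-21:00
Scan the QR code to join the WeChat group and register for the livestream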
- Sora: Creating video from text. Technical report: Video generation models as world simulators
- DiT: Scalable Diffusion Models with Transformers
- Latte: Latte: Latent Diffusion Transformer for Video Generation
- More updates coming...
| Paper | Links |
| :--- | :--- |
| 1) Guided-Diffusion: Diffusion Models Beat GANs on Image Synthesis | Paper, Github |
| 2) Latent Diffusion: High-Resolution Image Synthesis with Latent Diffusion Models | Paper, Github |
| 3) EDM: Elucidating the Design Space of Diffusion-Based Generative Models | Paper, Github |
| 4) DDPM: Denoising Diffusion Probabilistic Models | Paper, Github |
| 5) DDIM: Denoising Diffusion Implicit Models | Paper, Github |
| 6) Score-Based Diffusion: Score-Based Generative Modeling through Stochastic Differential Equations | Paper, Github, Blog |
| 7) Stable Cascade: Würstchen: An efficient architecture for large-scale text-to-image diffusion models | Paper, Github, Blog |
| Paper | Links |
| :--- | :--- |
| 1) UViT: All are Worth Words: A ViT Backbone for Diffusion Models | Paper, Github, ModelScope |
| 2) DiT: Scalable Diffusion Models with Transformers | Paper, Github, ModelScope |
| 3) SiT: Exploring Flow and Diffusion-based Generative Models with Scalable Interpolant Transformers | Paper, Github, ModelScope |
| 4) FiT: Flexible Vision Transformer for Diffusion Model | Paper, Github |
| 5) k-diffusion: Scalable High-Resolution Pixel-Space Image Synthesis with Hourglass Diffusion Transformers | Paper, Github |
| 6) OpenDiT: An Easy, Fast and Memory-Efficient System for DiT Training and Inference | Github |
| Paper | Links |
| :--- | :--- |
| 1) Animatediff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning | Paper, Github, ModelScope |
| 2) I2VGen-XL: High-Quality Image-to-Video Synthesis via Cascaded Diffusion Models | Paper, Github, ModelScope |
| 3) Imagen Video: High Definition Video Generation with Diffusion Models | Paper |
| 4) MoCoGAN: Decomposing Motion and Content for Video Generation | Paper |
| 5) Adversarial Video Generation on Complex Datasets | Paper |
| 6) W.A.L.T: Photorealistic Video Generation with Diffusion Models | Paper, Project |
| 7) VideoGPT: Video Generation using VQ-VAE and Transformers | Paper, Github |
| 8) Video Diffusion Models | Paper, Github, Project |
| 9) MCVD: Masked Conditional Video Diffusion for Prediction, Generation, and Interpolation | Paper, Github, Project, Blog |
| 10) VideoPoet: A Large Language Model for Zero-Shot Video Generation | Paper |
| 11) MAGVIT: Masked Generative Video Transformer | Paper, Github, Project, Colab |
| 12) EMO: Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions | Paper, Github, Project |
| Paper | Links |
| :--- | :--- |
| 1) World Model on Million-Length Video And Language With RingAttention | Paper, Github |
| 2) Ring Attention with Blockwise Transformers for Near-Infinite Context | Paper, Github |
| 3) Extending LLMs' Context Window with 100 Samples | Paper, Github |
| 4) Efficient Streaming Language Models with Attention Sinks | Paper, Github |
| 5) The What, Why, and How of Context Length Extension Techniques in Large Language Models – A Detailed Survey | Paper |
| Paper | Links |
| :--- | :--- |
| 1) ViViT: A Video Vision Transformer | Paper, Github |
| 2) VideoLDM: Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models | Paper |
| 3) LVDM: Latent Video Diffusion Models for High-Fidelity Long Video Generation | Paper, Github |
| 4) LFDM: Conditional Image-to-Video Generation with Latent Flow Diffusion Models | Paper, Github |
| 5) MotionDirector: Motion Customization of Text-to-Video Diffusion Models | Paper, Github |
| Resource | Links |
| :--- | :--- |
| 1) Datawhale - AI Video Generation Learning | Feishu doc |
| 2) A Survey on Generative Diffusion Model | Paper, Github |
| 3) Awesome-Video-Diffusion-Models | Paper, Github |
| 4) Awesome-Text-To-Video: A Survey on Text-to-Video Generation/Synthesis | Github |
| 5) video-generation-survey: A reading list of video generation | Github |
| 6) Awesome-Video-Diffusion | Github |
| 7) Video Generation Task in Papers With Code | Task |
| 8) Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models | Paper |