This seminar will focus on the latest developments in diffusion models, particularly video diffusion models. Topics include temporal and identity consistency, efficiency, and applications in the realm of human avatars.
Resources:
- Diffusion models blog post (more intuitive approach)
- Blog covering many subjects in diffusion models - highly recommended
- Tutorial on diffusion models (in-depth and detailed)
Title | Paper / Resource | Year | Why is it interesting? | Assignee | Recording | Slides
---|---|---|---|---|---|---
Introduction | | | Review of key concepts from our previous seminar, followed by a brief overview of the new seminar | Shira Bar On | zoom(^gr9RV*^) | slides
Video Diffusion Model in Latent Space | Video Probabilistic Diffusion Models in Projected Latent Space + Align Your Latents | 2023 | Example of two different approaches to Video Latent Diffusion Models | Tal Ben Haim | zoom(mQe@N2^0) | slides
Denoising Diffusion Implicit Models (DDIM) and Distillation | DDIM + Knowledge Distillation in Iterative Generative Models for Improved Sampling Speed | 2020, 2021 | DDIM introduces deterministic sampling; the distillation paper builds on this concept to lower the number of sampling steps to one | Tomer Stolik | zoom(Dha$Ue7&) | slides
Diffusion Transformer | DiT | 2022 (ICCV 2023) | Replaces the U-Net backbone of the diffusion model with a Transformer, improving results and conditioning control | Shira Bar On | zoom(?g16iHvN) | slides
AnimateDiff | AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning | ICLR'24 spotlight | Introduces the idea of converting an off-the-shelf text-to-image diffusion model into a text-to-video diffusion model | Ganit Kupershmidt | zoom(?yjK4Df0) | slides
Controlling Generative Video Models | Lightricks | | Generating realistic images and videos from text has marked a significant milestone. However, as human beings, our ambitions extend further: we aim not only for visually appealing outcomes but also for enhanced control over the generated result. In my talk, I will explore the techniques we employ at Lightricks to precisely control model outputs to fulfill specific requirements | Neomi Ken Korem | zoom(kG?j66Z2) | slides
Animate Anyone | Animate Anyone: Consistent and Controllable Image-to-Video Synthesis for Character Animation | 2023 | Video diffusion for avatar animation, introducing ReferenceNet for identity preservation | Matan Feldman | zoom(mK!0A7?e) | slides
EMO: Emote Portrait Alive | EMO | 2024 | Face animation using video diffusion models, taking a single image and audio as inputs | Alon Mengi | zoom(kA.sh4BY) | slides
Gen1 (Runway) | Gen1, Gen2 blog | 2023 | Gen-1, 2, and 3 are among the best-known video diffusion models | Daniel Duenias | zoom(B#638.5R) | slides
Advanced Topics in Distillation | Progressive Distillation + Consistency Models | 2022, 2023 | Two approaches for lowering the number of sampling steps | Amitay Nachmani | zoom(M7gHq+Va) | slides
Rectified Flow | Rectified Flow + InstaFlow | 2023 | Rectified Flow optimizes diffusion models by straightening their transport paths; InstaFlow applies this technique | Ofir Bar Tal | zoom(T%%96K^4) | slides
Generative Image Dynamics | Generative Image Dynamics | CVPR 2024 Best Paper Award | Uses diffusion models to generate image dynamics via the Fourier domain | Roy Hachnochi | zoom(?68E!U%R) | slides
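The deterministic sampling that the DDIM talk covers can be sketched in a few lines. This is a minimal illustration, not any paper's reference code: `eps_model` is a hypothetical noise predictor, and `alpha_bar` is assumed to hold the cumulative noise schedule.

```python
# Sketch of one deterministic DDIM sampling step (eta = 0).
# Assumptions: `eps_model(x, t)` is a hypothetical noise predictor,
# `alpha_bar[t]` is the cumulative product of the noise schedule.
import numpy as np

def ddim_step(x_t, t, t_prev, eps_model, alpha_bar):
    """Map x_t deterministically to x_{t_prev}."""
    a_t, a_prev = alpha_bar[t], alpha_bar[t_prev]
    eps = eps_model(x_t, t)                                   # predicted noise
    x0_pred = (x_t - np.sqrt(1 - a_t) * eps) / np.sqrt(a_t)   # predicted clean sample
    # Re-noise the prediction to the earlier timestep, with no random term:
    return np.sqrt(a_prev) * x0_pred + np.sqrt(1 - a_prev) * eps
```

Because the step is deterministic, the same starting noise always yields the same sample, which is what makes step-reducing distillation of the sampler possible.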
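The "straightened transport paths" mentioned in the Rectified Flow row come down to a simple training target, sketched below under the common convention (a hedged illustration, not the paper's code): sample a point on the straight line between data `x0` and noise `x1`, and regress the velocity field onto the constant direction `x1 - x0`.

```python
# Sketch of the rectified-flow training pair: a point on the straight
# interpolation path and its velocity-regression target.
import numpy as np

def rectified_flow_pair(x0, x1, t):
    """Return (x_t, target velocity) for time t in [0, 1]."""
    x_t = (1.0 - t) * x0 + t * x1   # straight path from data to noise
    v_target = x1 - x0              # constant velocity along that path
    return x_t, v_target
```

Since the target velocity does not depend on `t`, a well-fit model can traverse the path in very few steps, which is the property InstaFlow exploits for one-step generation.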