Awesome Papers on Video/3D Generation and Representation

Video Generation

2023

PVDM: Video Probabilistic Diffusion Models in Projected Latent Space

[code] (CVPR 2023)

MAGVIT: Masked Generative Video Transformer

[paper][page][code(coming soon)]

MagicVideo: Efficient Video Generation With Latent Diffusion Models

[paper][page]

Phenaki: Variable Length Video Generation From Open Domain Textual Description

[paper] (ICLR 2023)

Make-A-Video: Text-to-Video Generation without Text-Video Data

(ICLR 2023)

StyleFaceV: Face Video Generation via Decomposing and Recomposing Pretrained StyleGAN3

[paper][page][code]

CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers

[paper][page][code] (ICLR 2023)

2022

Generating Long Videos of Dynamic Scenes

[paper][page][code] (NeurIPS 2022)

Video Diffusion Models

[paper][page] (NeurIPS 2022)

MCVD: Masked Conditional Video Diffusion for Prediction, Generation, and Interpolation

[paper][page][code] (NeurIPS 2022)

TATS: Long Video Generation with Time-Agnostic VQGAN and Time-Sensitive Transformer

[paper][code] (ECCV 2022)

CelebV-HQ: A Large-Scale Video Facial Attributes Dataset

[paper][page] (ECCV 2022)

DIGAN: Generating Videos with Dynamics-aware Implicit Generative Adversarial Networks

[paper][code] (ICLR 2022)

MUGEN: A Playground for Video-Audio-Text Multimodal Understanding and GENeration

[paper][page][code] (ECCV 2022)

VideoINR: Learning Video Implicit Neural Representation for Continuous Space-Time Super-Resolution

[paper][code] (CVPR 2022)

Show Me What and Tell Me How: Video Synthesis via Multimodal Conditioning

[paper][page][code] (CVPR 2022)

StyleGAN-V: A Continuous Video Generator with the Price, Image Quality and Perks of StyleGAN2

[paper][page][code] (CVPR 2022)

Video2StyleGAN: Disentangling Local and Global Variations in a Video

[paper]

2021

CCVS: Context-aware Controllable Video Synthesis

[paper] (NeurIPS 2021)

V3GAN: Decomposing Background, Foreground and Motion for Video Generation

[paper]

Playable Video Generation

[paper][code] (CVPR 2021 Oral)

Stochastic Image-to-Video Synthesis using cINNs

[paper][page][code] (CVPR 2021)

Generative Video Transformer: Can Objects be the Words?

[paper] (ICML 2021)

VideoGPT: Video Generation using VQ-VAE and Transformers

[paper][code]

StyleVideoGAN: A Temporal Generative Model using a Pretrained StyleGAN

[paper]

MoCoGAN-HD: A Good Image Generator Is What You Need for High-Resolution Video Synthesis

[paper][code] (ICLR 2021 Spotlight)

InMoDeGAN: Interpretable Motion Decomposition Generative Adversarial Network for Video Generation

[paper][page]

Temporal Shift GAN for Large Scale Video Generation

[paper] (WACV 2021)

2020

Infinite Nature: Perpetual View Generation of Natural Scenes from a Single Image

[paper][page][code] (ICCV 2021 Oral)

Learning Temporal Coherence via Self-Supervision for GAN-based Video Generation

[paper][page][code] (ACM TOG 2020)

G3AN: Disentangling Appearance and Motion for Video Generation

[paper][code] (CVPR 2020)

ImaGINator: Conditional Spatio-Temporal GAN for Video Generation

[paper] (WACV 2020)

2019

Conditional GAN with Discriminative Filter Generation for Text-to-Video Synthesis

[paper] (IJCAI 2019)

2018

MoCoGAN: Decomposing Motion and Content for Video Generation

[paper] (CVPR 2018)

2017

TGAN: Temporal Generative Adversarial Nets with Singular Value Clipping

[paper] (ICCV 2017)

2016

Generating Videos with Scene Dynamics

[paper] (NeurIPS 2016)

Conditional

Temporally Consistent Semantic Video Editing

[paper]

Video Representation

2022

Scalable Neural Video Representations with Learnable Positional Features

[paper][page][code] (NeurIPS 2022)

MCL: Motion-Focused Contrastive Learning of Video Representations

[paper] (ICCV 2021 Oral)

2021

FAME: Motion-aware Contrastive Video Representation Learning via Foreground-background Merging

[paper] (CVPR 2022)

TAM: Temporal Adaptive Module for Video Recognition

[paper] (ICCV 2021)

Self-supervised Video Representation Learning by Context and Motion Decoupling

[paper] (CVPR 2021)

Enhancing Unsupervised Video Representation Learning by Decoupling the Scene and the Motion

[paper] (AAAI 2021)

Others

2022

3D-Aware Video Generation

[paper]

Latent Image Animator: Learning to Animate Images via Latent Space Navigation

[paper][code] (ICLR 2022)

Note: models the motion from source to target as a learned direction in latent space.

Stochastic Backpropagation: A Memory Efficient Strategy for Training Video Models

[paper] (CVPR 2022)

Note: the lower layers are highly redundant, so gradients of the spatial model are randomly dropped during backpropagation to save memory.