Papers on generative modeling

GenForce: may generative force be with you. Refer to this page for our latest work.

Latent code to Image (StyleGAN and BigGAN types)

NeurIPS2020: Instance Selection for GANs. paper
comment: Score image samples with a likelihood function and keep only instances from dense regions of the data manifold, so that the GAN no longer has to represent the sparse regions.

NeurIPS2020: Top-k Training of GANs: Improving GAN Performance by Throwing Away Bad Samples. paper
comment: A one-line code modification to use a top-k update. The discriminator is used as a critic to rank the samples and perform surgery on the gradients.
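A minimal sketch of the top-k idea, assuming a generator loss of the form -mean(D(G(z))); the function name `topk_generator_loss` and the loss choice are illustrative, not taken from the paper's code. Samples outside the top k do not enter the loss, so they contribute no gradient to the generator.

```python
def topk_generator_loss(critic_scores, k):
    """Average a generator loss over only the k samples the critic rates highest.

    critic_scores: one discriminator output per generated sample
    (higher means the critic finds the sample more realistic).
    """
    if not 1 <= k <= len(critic_scores):
        raise ValueError("k must be between 1 and the batch size")
    # Rank samples by critic score and discard the lowest-scored ones.
    kept = sorted(critic_scores, reverse=True)[:k]
    # Assumed loss for illustration: -mean(D(G(z))) over the kept samples.
    return -sum(kept) / k


scores = [0.9, -1.2, 0.3, 0.1]
full_batch_loss = topk_generator_loss(scores, k=len(scores))
top2_loss = topk_generator_loss(scores, k=2)
```

In a tensor framework the same selection is typically a single top-k call on the discriminator outputs before the loss is reduced.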

CVPR2020: Interpreting the Latent Space of GANs for Semantic Face Editing. paper
comment: use facial attribute classifiers to discover the interpretable dimensions that emerge in GANs trained to synthesize faces.

CVPR2020: Image Processing Using Multi-Code GAN Prior. paper
comment: use the pretrained GAN model as a representation prior to facilitate a series of image processing tasks, such as colorization, super-resolution, denoising.

ECCV2020: In-Domain GAN Inversion for Real Image Editing. paper
comment: The inverted code from Image2StyleGAN is not manipulable enough. This work uses an encoder to regularize the optimization so that the inverted code stays editable. A novel semantic diffusion application is also proposed.

arXiv: Generative Hierarchical Features from Synthesizing Images. paper
comment: It treats the pretrained StyleGAN model as a learned loss (similar to a perceptual loss using a learned VGG) to train a hierarchical encoder. This work calls for exploring applications of the encoder in both discriminative and generative tasks. It also echoes my opinion that ImageNet classification is NOT the only way to evaluate the merits of features learned by self-supervised learning. There are so many visual tasks out there, why just stick to dog classification on ImageNet?? (fun fact: roughly 120 of the 1000 ImageNet classes are dog breeds).

SIGGRAPH20 Asia: StyleFlow: Attribute-conditioned Exploration of StyleGAN-Generated Images using Conditional Continuous Normalizing Flows. paper code
comment: embed a normalizing flow model in the latent space to support attribute-conditioned sampling and editing. The disentangled manipulation result is great.

ECCV2020: StyleGAN2 Distillation for Feed-forward Image Manipulation. paper, code?
comment: use the InterFaceGAN pipeline to generate paired data from a pretrained StyleGAN, then use the paired data to train a pix2pix network.

ECCV2020: Rewriting a Deep Generative Model. paper
comment: interactively replace the unit semantic patterns. David is the guru of interface, always.

ECCV2020: Exploiting Deep Generative Prior for Versatile Image Restoration and Manipulation. paper
comment: similar to the following SIGGRAPH'19 work, it develops an inversion method that fine-tunes the pretrained GAN weights to invert an image, then applies it to a series of image processing tasks (similar to the Multi-Code GAN prior above).

ECCV2020: DeepLandscape: Adversarial Modeling of Landscape Videos. paper page
comment: Train a StyleGAN on time-lapse videos and then animate a landscape image.

SIGGRAPH'19: Semantic Photo Manipulation with a Generative Image Prior. paper
comment: it treats the GAN model as an image prior, fine-tunes the weights of the pretrained network to invert a given image, and finally uses the unit semantics discovered by the following GAN Dissection work to manipulate the image content.

ICLR'19: GAN Dissection: Visualizing and Understanding Generative Adversarial Networks. paper
comment: one of the earliest works to look into the interpretability of generative models (PG-GAN).

Disentanglement of Variation Factors in Generative Models

arXiv: Closed-Form Factorization of Latent Semantics in GANs. paper
comment: Unsupervised discovery of the interpretable dimensions in learned GAN models. The algorithm is blazingly fast, taking only 1 second!
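A sketch of why the factorization is closed-form, assuming the setup in which the first transformation the generator applies to the latent code z is affine, y = Az + b (the symbols A, b, n, and alpha here are notation for this note). Moving z along a unit direction n shifts the projected code by alpha·An, so the most influential directions are those that maximize the norm of An, which reduces to an eigenvalue problem on the pretrained weights alone:

```latex
\begin{aligned}
\mathbf{y}' &= \mathbf{A}(\mathbf{z} + \alpha\mathbf{n}) + \mathbf{b} = \mathbf{y} + \alpha\mathbf{A}\mathbf{n}, \\
\mathbf{n}^{*} &= \arg\max_{\mathbf{n}^{\top}\mathbf{n} = 1} \lVert \mathbf{A}\mathbf{n} \rVert_2^2
\;\Longrightarrow\; \mathbf{A}^{\top}\mathbf{A}\,\mathbf{n}_i = \lambda_i \mathbf{n}_i .
\end{aligned}
```

The candidate semantic directions are the eigenvectors of A^T A with the largest eigenvalues, so no sampling or training is involved.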

ECCV'20: The Hessian Penalty: A Weak Prior for Unsupervised Disentanglement. paper page
comment: use a simple Hessian penalty in GAN training to encourage disentanglement of features. The idea is to encourage the Hessian matrix of the generator G to be diagonal.
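A toy sketch of a penalty that vanishes exactly when the Hessian is diagonal, assuming a scalar-valued function in place of a real generator. It measures the variance of v^T H v over Rademacher vectors v (entries ±1), with v^T H v approximated by a second-order finite difference. Enumerating every v, the step size `eps`, and the two toy functions are assumptions made so the example is deterministic; a practical implementation would sample a few vectors per step.

```python
import itertools


def hessian_penalty(G, z, eps=1e-2):
    """Variance of v^T H v over all Rademacher vectors v.

    H is the Hessian of the scalar function G at z. The variance is zero
    when H is diagonal and grows with the off-diagonal entries.
    """
    g0 = G(z)
    quad_forms = []
    for v in itertools.product((-1.0, 1.0), repeat=len(z)):
        z_plus = [zi + eps * vi for zi, vi in zip(z, v)]
        z_minus = [zi - eps * vi for zi, vi in zip(z, v)]
        # Central second difference approximates v^T H v.
        quad_forms.append((G(z_plus) - 2.0 * g0 + G(z_minus)) / eps ** 2)
    mean = sum(quad_forms) / len(quad_forms)
    return sum((q - mean) ** 2 for q in quad_forms) / len(quad_forms)


def entangled(z):
    # Hessian [[0, 1], [1, 0]]: the two inputs interact.
    return z[0] * z[1]


def disentangled(z):
    # Hessian diag(2, 6): each input acts independently.
    return z[0] ** 2 + 3.0 * z[1] ** 2


z0 = [0.5, -1.0]
penalty_entangled = hessian_penalty(entangled, z0)
penalty_disentangled = hessian_penalty(disentangled, z0)
```

Here the entangled function gets a penalty of about 4 (twice the sum of squared off-diagonal Hessian entries), while the disentangled one gets a penalty of about 0.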

Image to Image (Pix2pix and cycleGAN types)

ECCV2020: Contrastive Learning for Unpaired Image-to-Image Translation. paper
comment: Use a local (patch-wise) contrastive loss to improve the quality of the pix2pix architecture.
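A minimal sketch of a patch-wise contrastive (InfoNCE) loss for a single query, assuming the query is a feature of an output patch, the positive is the feature of the input patch at the same location, and the negatives are features of other input patches. The function name `patch_nce_loss`, the plain-list features, and the temperature default are illustrative assumptions.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(u):
    norm = math.sqrt(dot(u, u))
    return [a / norm for a in u]


def patch_nce_loss(query, positive, negatives, tau=0.07):
    """InfoNCE loss: cross-entropy with the positive patch as the target class."""
    q = normalize(query)
    # Cosine similarities scaled by the temperature; index 0 is the positive.
    logits = [dot(q, normalize(positive)) / tau]
    logits += [dot(q, normalize(n)) / tau for n in negatives]
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum_exp - logits[0]


query = [1.0, 0.0]
aligned = patch_nce_loss(query, [1.0, 0.1], [[0.0, 1.0], [-1.0, 0.0]])
misaligned = patch_nce_loss(query, [0.0, 1.0], [[1.0, 0.1], [-1.0, 0.0]])
```

The loss is small when the query is closest to its positive patch and large when a negative patch is a better match, which is what pulls corresponding patches together.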

ECCV2020: Learning to Factorize and Relight a City. paper
comment: use pix2pix arch + code swap to factorize the intrinsic images of street-view scenes. It looks similar to the following work.

arXiv: Swapping Autoencoder for Deep Image Manipulation. paper
comment: use code swapping to disentangle textures and contents.

a recent survey dated Aug 2020: Generative Adversarial Networks for Image and Video Synthesis: Algorithms and Applications. paper

Generative models for 3D

NeurIPS2020: Generative 3D Part Assembly via Dynamic Graph Learning. paper code
comment: an iterative graph neural network that reasons about dynamic part relations and assembles 3D furniture.

NeurIPS2020: BlockGAN: Learning 3D Object-aware Scene Representations from Unlabelled Images. paper page

ECCV2020: Learning Gradient Fields for Shape Generation. paper page
comment: 3D point cloud generation by a mixture of Gaussian model.

ICCV2019: PointFlow : 3D Point Cloud Generation with Continuous Normalizing Flows. paper page
comment: invertible Normalizing flow model for 3D point generation.

SIGGRAPH ASIA 2020: Scene Mover: Automatic Move Planning for Scene Arrangement by Deep Reinforcement Learning. paper page

Generative models for drug discovery

Nature MI: Generative molecular design in low data regimes. paper code

Nature MI: Direct steering of de novo molecular generation with descriptor conditional recurrent neural networks. paper

Medium: Creating Molecules from Scratch I: Drug Discovery with Generative Adversarial Networks link

Generative models for animation and locomotion

SIGGRAPH20: CARL: Controllable Agent with Reinforcement Learning for Quadruped Locomotion. paper page
comment: a physics-based controller trained in three stages: 1) imitation learning for low-level control, which specifies the agent's movement at the joint level; 2) a GAN control adapter that approximates the natural action distribution under high-level user control; 3) DRL fine-tuning to improve the controller's ability to adapt to unseen scenarios. The user has high-level control over speed and heading. Using a high-level GAN controller in the second stage to mimic the behavior of the trained low-level controller, via a GAN loss, is a clever idea. Paired labels c_high and c_low are available. Controllable agent is the way to go!

SIGGRAPH20: Local Motion Phases for Learning Multi-Contact Character Movements. paper page
comment: A generative control model is introduced to produce a variety of realistic movements from the coarse user control signal. The model is an encoder-decoder structure trained with a GAN loss.

SIGGRAPH20: Character Controllers using Motion VAEs. paper page
comment: A clear two-stage framework to train character controllers. 1) Train a motion synthesis VAE model: the encoder takes p_{t-1} and p_t, and the decoder outputs p_t. 2) Throw away the encoder and train a controller that outputs latent codes to the decoder.

SIGGRAPH20: Unpaired Motion Style Transfer from Video to Animation. paper page
comment: Disentangling content code and style code for motion style transfer.

AI4Animation: Deep Learning, Character Animation, Control. link
comment: several relevant SIGGRAPH papers and code and locomotion data.

Simulator generation

ICLR19: Learning to simulate. paper
comment: use policy gradient to optimize the simulation parameters.

ICCV19: Meta-Sim: Learning to Generate Synthetic Datasets. paper page
comment: Learn a generative model of synthetic scenes with a graphics engine. The content distribution can be matched. Down-stream tasks can be integrated and jointly optimized.

Generative models for structured data

House-GAN++: Generative Adversarial Layout Refinement Networks. paper

SceneGen: Learning to Generate Realistic Traffic Scenes. paper video
comment: learn to generate HD maps of traffic scenes.

Relevant Researchers (random order)

Craig Yu: faculty at GMU. on graphics

Jun-Yan Zhu: Adobe researcher and faculty at CMU. on vision + graphics

Tero Karras: nvidia research, lead author of StyleGAN, the Kaiming He of generative modeling. oh man, i love this guy's work.

Ming-Yu Liu: nvidia researcher on computer vision

David Bau: phd at mit, my collaborator. David is a guru of graphic interface! on model understanding and interpretability.

Alexei Efros: faculty at berkeley. without doubt, the pixel god-father!

Aaron Hertzmann: Principal scientist at Adobe. without doubt, the pioneer in image generation.

zhoubolei/awesome-generative-modeling