Contour Stochastic Gradient Langevin Dynamics
Simulations of multi-modal distributions can be very costly and often lead to unreliable predictions. To accelerate the computations, we propose to sample from a flattened distribution and to estimate the importance weights between the original distribution and the flattened one, so that the correct target distribution is recovered.
We refer interested readers to the blog post here. Chinese readers may also find this Zhihu (知乎) post interesting.
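To make the reweighting idea concrete, here is a small, self-contained Python sketch on a 1-D toy target. It is not code from this repository: the toy energy, the energy partition grid, the choice ζ = 1, and the grid-based computation of the partition masses (which CSGLD instead estimates adaptively) are all our own illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
grid = np.linspace(-8.0, 8.0, 8001)
dx = grid[1] - grid[0]

def energy(x):
    # Toy bimodal target pi(x) ~ exp(-U(x)): Gaussians centered at -3 and +3.
    return -np.logaddexp(-0.5 * (x - 3.0) ** 2, -0.5 * (x + 3.0) ** 2)

pi = np.exp(-energy(grid))
pi /= pi.sum() * dx                      # normalized density on the grid

# Partition the energy axis and let theta[i] be the target mass of partition i
# (CSGLD estimates theta on the fly; here we compute it directly on the grid).
edges = np.linspace(energy(grid).min(), energy(grid).max(), 20)
J = np.clip(np.searchsorted(edges, energy(grid)), 0, len(edges) - 1)
theta = np.bincount(J, weights=pi * dx, minlength=len(edges))

# Flattened density pi(x) / theta[J(x)]: deep, heavy modes are damped, which
# makes it much easier for a sampler to travel between them.
flat = pi / np.maximum(theta[J], 1e-12)
flat /= flat.sum() * dx

# Draw from the flattened density, then correct with weights ~ theta[J(x)].
p = flat * dx
p /= p.sum()
idx = rng.choice(len(grid), size=100_000, p=p)
w = theta[J[idx]]
w /= w.sum()
print("E[x^2] under the target (exact)      :", np.sum(pi * dx * grid ** 2))
print("E[x^2] from reweighted flat samples  :", np.sum(w * grid[idx] ** 2))
print("E[x^2] without reweighting (biased)  :", np.mean(grid[idx] ** 2))
```

The uncorrected average over the flattened samples is biased, while the importance-weighted average recovers the expectation under the original target; this is the correction that CSGLD applies to the samples it collects.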
Methods | Speed | Special features | Cost |
---|---|---|---|
SGLD (ICML'11) | Extremely slow | None | None |
Cyclic SGLD (ICLR'20) | Medium | Cyclic learning rates | More cycles |
Replica exchange SGLD (ICML'20) | Fast | Swaps/Jumps | Parallel chains |
Contour SGLD (NeurIPS'20) | Fast | Bouncy moves | Latent vector |
The following demo shows how the latent vector is gradually estimated:
Although this version of CSGLD has a global stability condition, it does not handle high-loss problems appropriately. Please wait for the acceptance of a follow-up paper (submitted) that solves the scalability problem for importance sampling.
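To make the latent-vector estimation in the demo above concrete, here is a minimal, self-contained Python sketch of one CSGLD-style loop: a Langevin step on the flattened landscape followed by a stochastic-approximation update of the latent vector. It is our own simplification, not the repository code: the 1-D toy energy, the finite-difference (full-batch) gradient, the multiplier clipping, and all hyperparameter values are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

def energy(x):
    # Toy bimodal energy: exp(-U(x)) is a mixture of Gaussians at -3 and +3,
    # so the global energy minimum is approximately 0.
    return -np.logaddexp(-0.5 * (x - 3.0) ** 2, -0.5 * (x + 3.0) ** 2)

def grad_energy(x, h=1e-4):
    # Finite-difference gradient of the energy; a real run would use a
    # stochastic mini-batch gradient computed by autograd.
    return (energy(x + h) - energy(x - h)) / (2.0 * h)

m, u_min, u_max = 20, 0.0, 8.0           # number of energy partitions and range
delta_u = (u_max - u_min) / m
theta = np.full(m, 1.0 / m)              # latent vector, initialized uniform
lr, temp, zeta = 1e-2, 1.0, 0.75         # illustrative hyperparameters

def part(u):
    # Index J(x) of the energy partition containing u.
    return int(np.clip((u - u_min) / delta_u, 0, m - 1))

x = 0.0
for k in range(1, 20_001):
    j = part(energy(x))
    # Gradient multiplier induced by the flattened target; when it turns
    # negative, the sampler is pushed uphill (the "bouncy" escape from a well).
    mult = 1.0 + zeta * temp * (np.log(theta[j]) - np.log(theta[max(j - 1, 0)])) / delta_u
    mult = float(np.clip(mult, -10.0, 10.0))   # crude safeguard for this toy only
    x = x - lr * mult * grad_energy(x) + np.sqrt(2.0 * lr * temp) * rng.normal()
    # Stochastic approximation: move some of theta's mass toward the partition
    # just visited; the update preserves sum(theta) == 1.
    j = part(energy(x))
    omega = 1.0 / (k ** 0.8 + 100.0)
    theta += omega * theta[j] ** zeta * ((np.arange(m) == j).astype(float) - theta)

print("estimated latent vector theta:", np.round(theta, 3))
```

The decreasing step size `omega` lets the latent vector stabilize over time, and the collected samples can then be reweighted by `theta[J(x)] ** zeta`, as in the reweighting sketch earlier in this README.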
@inproceedings{CSGLD,
  title={A Contour Stochastic Gradient Langevin Dynamics Algorithm for Simulations of Multi-modal Distributions},
  author={Wei Deng and Guang Lin and Faming Liang},
  booktitle={Advances in Neural Information Processing Systems},
  year={2020}
}
References:

- M. Welling, Y. W. Teh. Bayesian Learning via Stochastic Gradient Langevin Dynamics. ICML'11.
- R. Zhang, C. Li, J. Zhang, C. Chen, A. Wilson. Cyclical Stochastic Gradient MCMC for Bayesian Deep Learning. ICLR'20.
- W. Deng, Q. Feng, L. Gao, F. Liang, G. Lin. Non-convex Learning via Replica Exchange Stochastic Gradient MCMC. ICML'20.
- W. Deng, G. Lin, F. Liang. A Contour Stochastic Gradient Langevin Dynamics Algorithm for Simulations of Multi-modal Distributions. NeurIPS'20.