Contour Stochastic Gradient Langevin Dynamics
Simulations of multi-modal distributions can be very costly and often lead to unreliable predictions. To accelerate the computations, we propose to sample from a flattened distribution and to estimate the importance weights between the original distribution and the flattened one, so that the correct target distribution is recovered.
We refer interested readers to the blog post here. Chinese readers may also find this Zhihu (知乎) post interesting.
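To make the reweighting idea concrete, here is a small, self-contained Python sketch on a 1-D toy target. It is not code from this repository: the toy energy, the energy partition grid, the choice ζ = 1, and the grid-based computation of the partition masses (which CSGLD instead estimates adaptively) are all our own illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
grid = np.linspace(-8.0, 8.0, 8001)
dx = grid[1] - grid[0]

def energy(x):
    # Toy bimodal target pi(x) ~ exp(-U(x)): Gaussians centered at -3 and +3.
    return -np.logaddexp(-0.5 * (x - 3.0) ** 2, -0.5 * (x + 3.0) ** 2)

pi = np.exp(-energy(grid))
pi /= pi.sum() * dx                      # normalized density on the grid

# Partition the energy axis and let theta[i] be the target mass of partition i
# (CSGLD estimates theta on the fly; here we compute it directly on the grid).
edges = np.linspace(energy(grid).min(), energy(grid).max(), 20)
J = np.clip(np.searchsorted(edges, energy(grid)), 0, len(edges) - 1)
theta = np.bincount(J, weights=pi * dx, minlength=len(edges))

# Flattened density pi(x) / theta[J(x)]: deep, heavy modes are damped, which
# makes it much easier for a sampler to travel between them.
flat = pi / np.maximum(theta[J], 1e-12)
flat /= flat.sum() * dx

# Draw from the flattened density, then correct with weights ~ theta[J(x)].
p = flat * dx
p /= p.sum()
idx = rng.choice(len(grid), size=100_000, p=p)
w = theta[J[idx]]
w /= w.sum()
print("E[x^2] under the target (exact)      :", np.sum(pi * dx * grid ** 2))
print("E[x^2] from reweighted flat samples  :", np.sum(w * grid[idx] ** 2))
print("E[x^2] without reweighting (biased)  :", np.mean(grid[idx] ** 2))
```

The uncorrected average over the flattened samples is biased, while the importance-weighted average recovers the expectation under the original target; this is the correction that CSGLD applies to the samples it collects.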
Methods | Speed | Special features | Cost |
---|---|---|---|
SGLD (ICML'11) | Extremely slow | None | None |
Cyclic SGLD (ICLR'20) | Medium | Cyclic learning rates | More cycles |
Replica exchange SGLD (ICML'20) | Fast | Swaps/Jumps | Parallel chains |
Contour SGLD (NeurIPS'20) | Fast | Bouncy moves | Latent vector |
The following demo shows how the latent vector is gradually estimated:
Although this version of CSGLD has a global stability condition, it does not handle high-loss problems appropriately. Please wait for the acceptance of a follow-up paper (submitted) that solves the scalability problem for importance sampling.
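To make the latent-vector estimation in the demo above concrete, here is a minimal, self-contained Python sketch of one CSGLD-style loop: a Langevin step on the flattened landscape followed by a stochastic-approximation update of the latent vector. It is our own simplification, not the repository code: the 1-D toy energy, the finite-difference (full-batch) gradient, the multiplier clipping, and all hyperparameter values are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

def energy(x):
    # Toy bimodal energy: exp(-U(x)) is a mixture of Gaussians at -3 and +3,
    # so the global energy minimum is approximately 0.
    return -np.logaddexp(-0.5 * (x - 3.0) ** 2, -0.5 * (x + 3.0) ** 2)

def grad_energy(x, h=1e-4):
    # Finite-difference gradient of the energy; a real run would use a
    # stochastic mini-batch gradient computed by autograd.
    return (energy(x + h) - energy(x - h)) / (2.0 * h)

m, u_min, u_max = 20, 0.0, 8.0           # number of energy partitions and range
delta_u = (u_max - u_min) / m
theta = np.full(m, 1.0 / m)              # latent vector, initialized uniform
lr, temp, zeta = 1e-2, 1.0, 0.75         # illustrative hyperparameters

def part(u):
    # Index J(x) of the energy partition containing u.
    return int(np.clip((u - u_min) / delta_u, 0, m - 1))

x = 0.0
for k in range(1, 20_001):
    j = part(energy(x))
    # Gradient multiplier induced by the flattened target; when it turns
    # negative, the sampler is pushed uphill (the "bouncy" escape from a well).
    mult = 1.0 + zeta * temp * (np.log(theta[j]) - np.log(theta[max(j - 1, 0)])) / delta_u
    mult = float(np.clip(mult, -10.0, 10.0))   # crude safeguard for this toy only
    x = x - lr * mult * grad_energy(x) + np.sqrt(2.0 * lr * temp) * rng.normal()
    # Stochastic approximation: move some of theta's mass toward the partition
    # just visited; the update preserves sum(theta) == 1.
    j = part(energy(x))
    omega = 1.0 / (k ** 0.8 + 100.0)
    theta += omega * theta[j] ** zeta * ((np.arange(m) == j).astype(float) - theta)

print("estimated latent vector theta:", np.round(theta, 3))
```

The decreasing step size `omega` lets the latent vector stabilize over time, and the collected samples can then be reweighted by `theta[J(x)] ** zeta`, as in the reweighting sketch earlier in this README.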
@inproceedings{CSGLD,
  title={A Contour Stochastic Gradient Langevin Dynamics Algorithm for Simulations of Multi-modal Distributions},
  author={Wei Deng and Guang Lin and Faming Liang},
  booktitle={Advances in Neural Information Processing Systems},
  year={2020}
}
References:

- M. Welling, Y. W. Teh. Bayesian Learning via Stochastic Gradient Langevin Dynamics. ICML'11.
- R. Zhang, C. Li, J. Zhang, C. Chen, A. Wilson. Cyclical Stochastic Gradient MCMC for Bayesian Deep Learning. ICLR'20.
- W. Deng, Q. Feng, L. Gao, F. Liang, G. Lin. Non-convex Learning via Replica Exchange Stochastic Gradient MCMC. ICML'20.
- W. Deng, G. Lin, F. Liang. A Contour Stochastic Gradient Langevin Dynamics Algorithm for Simulations of Multi-modal Distributions. NeurIPS'20.