Accelerating Convergence of Replica Exchange Stochastic Gradient MCMC via Variance Reduction
Wei Deng · Qi Feng · Georgios Karagiannis · Guang Lin · Faming Liang
Keywords: variance reduction, uncertainty quantification, replica exchange, parallel tempering, stochastic gradient Langevin dynamics, change of measure, generalized Girsanov theorem, Dirichlet form, Markov jump process
2021 Poster
Abstract
Replica exchange stochastic gradient Langevin dynamics (reSGLD) has shown promise in accelerating convergence in non-convex learning; however, the excessively large correction needed to avoid biases from noisy energy estimators has limited the potential of this acceleration. To address this issue, we study variance reduction for noisy energy estimators, which promotes much more effective swaps. Theoretically, we provide a non-asymptotic analysis of the exponential convergence of the underlying continuous-time Markov jump process; moreover, we consider a generalized Girsanov theorem that includes the change of Poisson measure, which overcomes the crude discretization based on Gr\"{o}nwall's inequality and yields a much tighter error in the 2-Wasserstein ($\mathcal{W}_2$) distance. Numerically, we conduct extensive experiments and obtain state-of-the-art results in optimization and uncertainty estimates for synthetic experiments and image data.
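To make the idea of variance reduction for a noisy energy estimator concrete, the following is a minimal sketch on a toy problem. It assumes a control-variate estimator in the style of SVRG, which may differ from the estimator analyzed in the paper; the quadratic per-sample energy, the anchor point, and all function names are hypothetical choices made only for illustration. It compares the variance of a plain mini-batch energy estimate with that of the control-variate version.

```python
import random
import statistics


def energy(x, theta):
    # Hypothetical per-sample energy (negative log-likelihood up to a constant).
    return 0.5 * (x - theta) ** 2


def full_energy(data, theta):
    # Exact energy summed over the whole dataset.
    return sum(energy(x, theta) for x in data)


def naive_estimate(data, theta, batch):
    # Plain mini-batch estimator: rescale the batch sum to the dataset size.
    scale = len(data) / len(batch)
    return scale * sum(energy(data[i], theta) for i in batch)


def vr_estimate(data, theta, batch, anchor, anchor_energy):
    # Control-variate estimator (assumed SVRG-style): exact energy at an
    # anchor point plus a rescaled mini-batch estimate of the difference.
    scale = len(data) / len(batch)
    diff = sum(energy(data[i], theta) - energy(data[i], anchor) for i in batch)
    return anchor_energy + scale * diff


def compare_variances(seed=0, n_data=1000, batch_size=32, trials=500):
    rng = random.Random(seed)
    data = [rng.gauss(0.0, 1.0) for _ in range(n_data)]
    theta, anchor = 0.30, 0.25  # the anchor is close to the current parameter
    anchor_energy = full_energy(data, anchor)
    naive, reduced = [], []
    for _ in range(trials):
        batch = rng.sample(range(n_data), batch_size)
        naive.append(naive_estimate(data, theta, batch))
        reduced.append(vr_estimate(data, theta, batch, anchor, anchor_energy))
    return (
        statistics.variance(naive),
        statistics.variance(reduced),
        statistics.mean(reduced),
        full_energy(data, theta),
    )


if __name__ == "__main__":
    var_naive, var_vr, mean_vr, exact = compare_variances()
    print("naive variance:", var_naive)
    print("variance-reduced variance:", var_vr)
    print("mean of reduced estimates vs exact energy:", mean_vr, exact)
```

In this sketch both estimators are unbiased for the full energy, but the control-variate version only carries the noise of the energy difference between the current parameter and the anchor, which is small when the two are close. A lower-variance energy estimate is what would allow a smaller bias correction in the swap step.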