Jing Dong

Abstract

Gradient descent is known to converge quickly for convex objective functions, but it can be trapped at local minima. On the other hand, Langevin dynamics can explore the state space and find global minima, but in order to give accurate estimates, it needs to run with a small discretization step size and a weak stochastic force, which in general slows down its convergence. This work shows that these two algorithms can “collaborate” through a simple exchange mechanism, in which they swap their current positions whenever Langevin dynamics yields a lower objective value. This idea can be seen as the singular limit of the replica-exchange technique from the sampling literature. We show that this new algorithm converges to the global minimum linearly with high probability, assuming the objective function is strongly convex in a neighborhood of the unique global minimum. By replacing gradients with stochastic gradients, and adding a proper threshold to the exchange mechanism, our algorithm can also be used in online settings. This is joint work with Xin Tong at National University of Singapore.
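
The exchange mechanism can be illustrated with a minimal sketch on a toy one-dimensional double-well objective. The objective, step size, temperature, and iteration count below are illustrative choices for exposition, not parameters from the work itself: a gradient-descent iterate and a discretized (Euler-Maruyama) Langevin iterate run in parallel, and they swap positions whenever the Langevin iterate attains a lower objective value.

```python
import math
import random

def f(x):
    # Toy double-well objective: global minimum near x = -1, local minimum near x = +1
    return (x * x - 1.0) ** 2 + 0.2 * x

def grad_f(x):
    return 4.0 * x * (x * x - 1.0) + 0.2

def gd_langevin_exchange(x_gd, x_ld, eta=0.05, beta=1.0, n_iters=20000, seed=0):
    """Run gradient descent and discretized Langevin dynamics in parallel,
    swapping positions whenever the Langevin iterate has a lower objective."""
    rng = random.Random(seed)
    noise_scale = math.sqrt(2.0 * eta / beta)  # Euler-Maruyama noise magnitude
    for _ in range(n_iters):
        # Deterministic gradient-descent step
        x_gd = x_gd - eta * grad_f(x_gd)
        # Langevin step: gradient step plus Gaussian stochastic force
        x_ld = x_ld - eta * grad_f(x_ld) + noise_scale * rng.gauss(0.0, 1.0)
        # Exchange: if the exploring Langevin chain found a lower objective,
        # hand that position to the fast-converging gradient-descent chain
        if f(x_ld) < f(x_gd):
            x_gd, x_ld = x_ld, x_gd
    return x_gd, x_ld

# Started at x = 0.9, plain gradient descent would stay trapped in the
# local minimum near x = +1; with exchanges, the gradient-descent iterate
# is handed a point in the global basin and converges near x = -1.
x_gd, _ = gd_langevin_exchange(x_gd=0.9, x_ld=0.9)
print(x_gd, f(x_gd))
```

The division of labor mirrors the abstract: the Langevin chain explores and escapes local basins, while the gradient-descent chain converges quickly once it is handed a point in the right basin. The online variant described above would additionally replace `grad_f` with a stochastic gradient and require the Langevin objective to be lower by a threshold before swapping.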