A Multiscale Perspective on Maximum Marginal Likelihood Estimation
Abstract
In this paper, we provide a multiscale perspective on the problem of maximum marginal likelihood estimation.
We consider and analyse a diffusion-based maximum marginal likelihood estimation scheme using ideas from multiscale dynamics.
Our perspective is based on stochastic averaging; we make an explicit connection between ideas in applied probability and parameter inference in computational statistics.
In particular, we consider a general class of coupled Langevin diffusions for joint inference of latent variables and parameters in statistical models, where the latent variables are sampled from a fast Langevin process (which acts as a sampler), and the parameters are updated using a slow Langevin process (which acts as an optimiser).
We show that the resulting system of stochastic differential equations (SDEs) can be viewed as a two-time scale system.
To demonstrate the utility of such a perspective, we show that the averaged parameter dynamics obtained in the limit of scale separation can be used to estimate the optimal parameter, within the strongly convex setting.
We do this by using recent uniform-in-time non-asymptotic averaging bounds.
Finally, we show that the slow-fast algorithm considered here, termed the Slow-Fast Langevin Algorithm, performs on par with state-of-the-art methods on a variety of examples.
We believe that the stochastic averaging approach provided in this paper lets us view these algorithms from a fresh angle and opens a path to developing and analysing new methods using well-established averaging principles.
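
To make the coupled slow-fast system described above concrete, the following is a minimal sketch of an Euler-Maruyama discretisation on a toy Gaussian latent variable model. It is an illustration under stated assumptions, not the scheme as specified in the paper: the model (x_i ~ N(theta, 1), y_i | x_i ~ N(x_i, 1)), the function name `slow_fast_langevin`, the scale-separation parameter `eps`, the inverse temperature `beta`, the step size `h`, and the exact way the noise is scaled are all hypothetical choices made for this example. For this toy model the maximum marginal likelihood estimate is the sample mean of the observations, which gives a simple check.

```python
import numpy as np


def slow_fast_langevin(y, eps=0.1, beta=100.0, h=1e-3, n_steps=20_000, seed=0):
    """Toy slow-fast Langevin sketch (assumed form, not the paper's exact scheme).

    Potential: U(theta, x) = sum_i [0.5 * (x_i - theta)^2 + 0.5 * (y_i - x_i)^2],
    i.e. the negative joint log-density of the toy model up to a constant.
    """
    rng = np.random.default_rng(seed)
    theta = 0.0                      # slow variable: the parameter
    x = np.zeros_like(y)             # fast variables: one latent per observation
    thetas = np.empty(n_steps)
    for k in range(n_steps):
        grad_theta = np.sum(theta - x)       # dU/dtheta at the current state
        grad_x = 2.0 * x - theta - y         # dU/dx_i at the current state
        # Slow Langevin step (optimiser): small noise because beta is large.
        theta_new = (theta - h * grad_theta
                     + np.sqrt(2.0 * h / beta) * rng.standard_normal())
        # Fast Langevin step (sampler): time is sped up by a factor 1/eps.
        x = (x - (h / eps) * grad_x
             + np.sqrt(2.0 * h / eps) * rng.standard_normal(y.shape))
        theta = theta_new
        thetas[k] = theta
    return thetas


# Synthetic data from the toy model with true parameter 2.0 (hypothetical).
data_rng = np.random.default_rng(1)
y = 2.0 + data_rng.standard_normal(50) + data_rng.standard_normal(50)

thetas = slow_fast_langevin(y)
theta_hat = thetas[len(thetas) // 2:].mean()   # average over the second half
print(f"estimate: {theta_hat:.3f}, sample mean of y: {y.mean():.3f}")
```

In this sketch a smaller `eps` makes the latent variables equilibrate faster relative to the parameter, so the parameter update is driven more closely by the averaged gradient; the price is a smaller stable step size for the fast component, since its effective step is `h / eps`.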