Exponential rate of convergence of relative value iteration algorithms for ergodic controls of diffusions
Abstract
In this paper, we investigate the rate of convergence of the relative value iteration (RVI) algorithms for diffusions in $\mathbb{R}^d$ under both the conventional ergodic cost (CEC) and ergodic risk-sensitive cost (ERSC) criteria, and under the uniform exponential stability condition.
The existing RVI algorithms for the CEC and ERSC problems solve the associated initial value Hamilton-Jacobi-Bellman type equations whose solutions are shown to converge asymptotically to the corresponding optimal values.
However, the rates of convergence of these algorithms have remained an open question.
This paper proposes discrete-time implementations for the RVI algorithms based on slight modifications of the associated PDEs, and proves that the rates of convergence of these RVI algorithms are exponential under a weighted sup-norm.
These implementations have discrete-time iterates that can be explicitly expressed as recursive systems.
The difference between these iterates and the desired value function in the CEC case can then be expressed in terms of the associated Markov kernels.
Similarly, in the ERSC case, the difference between the logarithms of the iterates and the logarithm of the desired value function can be expressed in terms of the associated Markov kernels for the extended diffusion.
As a result, we are able to prove the desired contraction properties and thereby establish the exponential rate of convergence, using a weighted semi-norm in which the Markov kernel acts as a contraction.
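
To illustrate the contraction idea in the simplest possible setting, the following is a minimal sketch of relative value iteration for a finite-state, finite-action average-cost Markov decision process. It is only an analogue of the schemes described above and is not the diffusion algorithm of the paper: the state space is finite, the semi-norm is the plain (unweighted) span, and every name here (`bellman`, `relative_value_iteration`, `make_example`, the reference state `ref`) is a hypothetical choice made for illustration. The transition matrices are assumed to have strictly positive entries, which makes each kernel a strict contraction in the span semi-norm, so the span of successive differences decays geometrically.

```python
import numpy as np


def span(v):
    """Span semi-norm: max(v) - min(v); it vanishes on constant vectors."""
    return float(np.max(v) - np.min(v))


def bellman(h, P, c):
    """One-step operator (T h)(x) = min_a [ c(x, a) + sum_y P(y | x, a) h(y) ].

    P has shape (A, S, S) with P[a, x, y] = P(y | x, a); c has shape (S, A).
    """
    q = c + np.einsum("asy,y->sa", P, h)
    return q.min(axis=1)


def relative_value_iteration(P, c, ref=0, iters=200):
    """Iterate h <- T h - (T h)[ref] and record span(h_new - h) at each step."""
    h = np.zeros(c.shape[0])
    spans = []
    for _ in range(iters):
        Th = bellman(h, P, c)
        h_new = Th - Th[ref]  # subtract the value at the reference state
        spans.append(span(h_new - h))
        h = h_new
    gain = bellman(h, P, c)[ref]  # estimate of the optimal average cost
    return h, gain, spans


def make_example(S=6, A=3, seed=0):
    """Random MDP with strictly positive transition probabilities (assumption)."""
    rng = np.random.default_rng(seed)
    P = rng.random((A, S, S)) + 0.1
    P /= P.sum(axis=2, keepdims=True)
    c = rng.random((S, A))
    return P, c


P, c = make_example()
h, gain, spans = relative_value_iteration(P, c)
residual = float(np.max(np.abs(bellman(h, P, c) - gain - h)))
print("estimated optimal average cost:", gain)
print("Bellman residual:", residual)
print("first successive-difference spans:", spans[:5])
```

In this toy setting the pair `(h, gain)` approximately solves the average-cost optimality equation `T h = gain + h`, and the recorded spans shrink by a roughly constant factor per iteration. The abstract's results concern the much harder case of diffusions on $\mathbb{R}^d$, where a weighted norm replaces the plain span used here.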