Exponential rate of convergence of relative value iteration algorithms for ergodic controls of diffusions

arXiv Math


CC BY


Abstract

In this paper, we investigate the rate of convergence of the relative value iteration (RVI) algorithms for diffusions in $\mathbb{R}^d$ under both the conventional ergodic cost (CEC) and ergodic risk-sensitive cost (ERSC) criteria, and under the uniform exponential stability condition.

The existing RVI algorithms for the CEC and ERSC problems solve the associated initial value Hamilton-Jacobi-Bellman type equations whose solutions are shown to converge asymptotically to the corresponding optimal values.

However, the rates of convergence for such algorithms have remained open.

This paper proposes discrete-time implementations for the RVI algorithms based on slight modifications of the associated PDEs, and proves that the rates of convergence of these RVI algorithms are exponential under a weighted sup-norm.

These implementations have discrete-time iterates that can be explicitly expressed as recursive systems.

The difference between these iterates and the desired value function in the CEC case can then be expressed in terms of the associated Markov kernels.

Similarly, in the ERSC case, the difference between the logarithms of the iterates and of the desired value function can be expressed in terms of the associated Markov kernels for the extended diffusion.

As a result, we are able to prove the desired contraction properties and thereby establish the exponential rate of convergence, using a weighted semi-norm in which the Markov kernel acts as a contraction.
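To make the idea of a discrete-time relative value iteration with a contraction in a semi-norm concrete, the following is a minimal sketch on a finite-state controlled Markov chain. It is an illustrative analog only, not the paper's scheme: the paper treats diffusions in $\mathbb{R}^d$ under a weighted sup-norm, whereas this toy uses a finite state space and the unweighted span semi-norm. The function name `relative_value_iteration`, the reference state, and the random test problem are all assumptions introduced for the illustration.

```python
import numpy as np


def relative_value_iteration(P, r, ref_state=0, num_iters=300):
    """Toy RVI for a finite-state ergodic-cost problem (hypothetical helper).

    P: array of shape (A, S, S), one transition kernel per action.
    r: array of shape (A, S), running cost per action and state.
    Returns the cost estimate rho, the relative value function h, and the
    span semi-norm of successive differences at every iteration.
    """
    num_states = P.shape[1]
    h = np.zeros(num_states)
    spans = []
    rho = 0.0
    for _ in range(num_iters):
        # Bellman operator: minimise one-step cost plus expected continuation.
        th = (r + P @ h).min(axis=0)
        # Subtract the value at a reference state to keep the iterates bounded.
        rho = th[ref_state]
        h_next = th - rho
        diff = h_next - h
        # Span semi-norm: insensitive to additive constants.
        spans.append(diff.max() - diff.min())
        h = h_next
    return rho, h, spans


# Assumed test problem: strictly positive kernels, so every pair of rows
# overlaps and the Bellman operator contracts in the span semi-norm.
rng = np.random.default_rng(0)
P = rng.random((2, 5, 5)) + 0.1
P /= P.sum(axis=2, keepdims=True)
r = rng.random((2, 5))

rho, h, spans = relative_value_iteration(P, r)
print("estimated ergodic cost:", rho)
print("span of last update:", spans[-1])
```

Plotting `spans` on a logarithmic axis should show roughly linear decay until floating-point precision is reached, which is the finite-state counterpart of an exponential convergence rate. The strictly positive kernels are a convenience assumption that guarantees the contraction here; they are not a condition taken from the paper.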
