A Data-Enabled Primal-Dual Approach for Policy Learning with SDP Formulations

arXiv Math

CC BY

This outlet displays the full text directly under a public or free license.

Abstract

This paper develops a data-enabled primal-dual framework for learning optimal control policies for unknown linear discrete-time systems from online data.

The proposed approach views the data-dependent control synthesis problem as a time-varying semidefinite program (SDP) whose coefficients are recursively updated from online closed-loop measurements.

Instead of repeatedly solving a full SDP as new data arrive, the policy is updated online through lightweight primal-dual iterations, each consisting of a linear equation solve and a projection onto the positive semidefinite cone.
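As a minimal sketch of the projection step mentioned above, the Euclidean (Frobenius-norm) projection of a symmetric matrix onto the positive semidefinite cone can be computed by clipping negative eigenvalues to zero. The function name `project_psd` and the example matrix are illustrative assumptions, not taken from the paper, and the linear-solve half of the iteration is not shown because the abstract does not specify it.

```python
import numpy as np


def project_psd(matrix):
    """Project a square matrix onto the PSD cone in Frobenius norm.

    Hypothetical helper for illustration: symmetrize, eigendecompose,
    and clip negative eigenvalues to zero.
    """
    sym = 0.5 * (matrix + matrix.T)          # guard against asymmetry
    eigvals, eigvecs = np.linalg.eigh(sym)   # eigh is for symmetric input
    clipped = np.clip(eigvals, 0.0, None)    # drop the negative part
    return (eigvecs * clipped) @ eigvecs.T   # V diag(clipped) V^T


# Example: this matrix has eigenvalues 3 and -1, so it is not PSD.
M = np.array([[1.0, 2.0], [2.0, 1.0]])
P = project_psd(M)
min_eig = float(np.linalg.eigvalsh(P).min())
```

Here `P` keeps only the eigenvalue-3 component of `M`. The cost is one symmetric eigendecomposition per iteration, which is typically much cheaper than solving a full SDP.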

The framework applies to both direct and indirect data-driven formulations and covers a broad class of control objectives, including LQR, $H_\infty$ control, and safety-critical control.

To characterize the coupling between online optimization and closed-loop data generation, we introduce two data-dependent quantities: the Sim-to-Real Gap, which measures the mismatch between noisy and noiseless data-induced SDPs, and the Difference-of-Signal, which measures the temporal variation of the SDP coefficients.

Under persistency of excitation, suitable SDP regularity conditions, and sufficiently slow data variation, we establish a local linear tracking result up to residual terms governed by the Sim-to-Real Gap and the Difference-of-Signal.
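The abstract does not state the exact excitation condition used. As a hedged illustration, one rank test common in the data-driven control literature requires the stacked state-input data matrix to have full row rank. The function names, the system matrices, and the tolerance below are all assumptions made for this sketch.

```python
import numpy as np


def is_sufficiently_exciting(states, inputs, tol=1e-8):
    """Check that the stacked data matrix [X; U] has full row rank.

    states: array of shape (n, T); inputs: array of shape (m, T).
    This is an assumed stand-in for the paper's excitation condition.
    """
    stacked = np.vstack([states, inputs])
    rank = np.linalg.matrix_rank(stacked, tol=tol)
    return bool(rank == stacked.shape[0])


def simulate(A, B, inputs, x0):
    """Roll out x_{k+1} = A x_k + B u_k and return states x_0..x_{T-1}."""
    n, T = A.shape[0], inputs.shape[1]
    states = np.zeros((n, T))
    x = x0
    for k in range(T):
        states[:, k] = x
        x = A @ x + B @ inputs[:, k]
    return states


# Hypothetical two-state, one-input system used only for this example.
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
x0 = np.array([1.0, -1.0])

rng = np.random.default_rng(0)
random_inputs = rng.standard_normal((1, 20))
zero_inputs = np.zeros((1, 20))

excited = is_sufficiently_exciting(simulate(A, B, random_inputs, x0), random_inputs)
unexcited = is_sufficiently_exciting(simulate(A, B, zero_inputs, x0), zero_inputs)
```

With random inputs the stacked matrix reaches rank 3, while an all-zero input sequence leaves the input row empty and the test fails.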

A global ergodic convergence bound is also derived for arbitrary initialization.

Numerical examples on LQR, $H_\infty$ control, and safe exploration demonstrate that the proposed method can efficiently improve control performance from online data while accommodating SDP constraints beyond the well-explored LQR policy-gradient formulations.
