Constructive Alignment: Governing Preference Dynamics in Human-AI Interaction

arXiv CS.AI


CC BY

Abstract

Most approaches to AI alignment treat human preferences as fixed targets to be inferred and optimized.

This assumption conflicts with extensive empirical evidence showing that preferences are layered, dynamic, and constructed through interaction--particularly with adaptive technologies.

As AI systems become more persistent, personalized, and socially embedded, they increasingly participate in shaping what people attend to, value, and endorse over time.

We introduce Constructive Alignment, a paradigm that reframes alignment as a control problem over evolving human preference trajectories rather than static preference satisfaction.

Drawing on behavioral economics, psychology, and constructivist social theory, we model preferences as layered state variables that evolve under interaction with AI systems.

We formalize this view using a control-theoretic framework in which system actions and interaction design jointly influence both world states and human evaluative states.

We argue that alignment is not primarily about controlling AI behavior, but about regulating how AI systems influence the evolution of human preferences--ensuring that value trajectories remain coherent, reflectively endorsed, epistemically grounded, bounded against manipulation, and empowering under uncertainty.

Alignment thus becomes a problem of governing long-term value formation rather than simply satisfying static preferences.
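
The abstract does not state the equations of its control-theoretic framework, so the following is only a rough sketch of what such a formulation might look like. Every symbol here is an assumption introduced for illustration, not the authors' notation: x_t is the world state, p_t the layered preference (evaluative) state, a_t the system action, d the interaction design, and C a hypothetical set of admissible preference trajectories standing in for the properties the abstract lists (coherent, reflectively endorsed, epistemically grounded, bounded against manipulation, empowering).

```latex
% Illustrative only: symbols, functions, and the objective are assumptions,
% not taken from the paper.
\begin{aligned}
x_{t+1} &= f(x_t, a_t, d), && \text{world-state dynamics} \\
p_{t+1} &= g(p_t, x_t, a_t, d), && \text{preference-state dynamics} \\
a_t &= \pi(x_t, p_t), && \text{system policy} \\
\max_{\pi,\, d}\ & \sum_{t=0}^{T} u(x_t; p_t)
  \quad \text{s.t.} \quad (p_0, \dots, p_T) \in \mathcal{C}. &&
\end{aligned}
```

Under this reading, static preference satisfaction would be the special case where g leaves p_t unchanged, so the constraint on the trajectory is trivially met and only the sum over u remains.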
