Locality-Aware Continual Unlearning for Diffusion Models

arXiv CS.AI

CC BY

Abstract

Real-world deployment of text-to-image diffusion models requires continual concept removal as new privacy, copyright, or safety obligations arise over time.

Existing unlearning methods, however, are designed for single-step deletion and collapse after only 3-5 sequential applications.

We trace this instability to two compounding factors: (i) coarse mapping targets that cause degradation to accumulate unnecessarily across steps, and (ii) the absence of local protection for semantically neighboring concepts, whose shared internal representations make them the first to suffer collateral damage.

Because this damage is strongest in the local semantic neighborhood of the forget concept, global replay alone cannot prevent it.

Building on this analysis, we propose Locality-Aware Continual Unlearning (LACU), a framework with two complementary mechanisms.

Locality-Aware Target Selection chooses, for each forget prompt, the context-preserving mapping prompt that the diffusion model itself treats as most similar to the original prompt, measured by score-prediction distance (how differently the model denoises the same noisy image under two text conditions), ensuring each update is as small and targeted as possible.
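
The selection rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: `predict_noise`, the toy denoiser, the example prompts, and the use of a mean squared difference as the distance are all assumptions made here so the example runs on its own.

```python
import random


def score_distance(predict_noise, prompt_a, prompt_b, noisy_samples):
    """Mean squared gap between noise predictions under two prompts.

    predict_noise(x_t, t, prompt) is a hypothetical stand-in for the
    diffusion model's denoiser and must return a list of floats.
    """
    total = 0.0
    for x_t, t in noisy_samples:
        eps_a = predict_noise(x_t, t, prompt_a)
        eps_b = predict_noise(x_t, t, prompt_b)
        total += sum((a - b) ** 2 for a, b in zip(eps_a, eps_b)) / len(eps_a)
    return total / len(noisy_samples)


def select_mapping_prompt(predict_noise, forget_prompt, candidates, noisy_samples):
    """Return the candidate whose predictions are closest to the forget prompt's."""
    return min(
        candidates,
        key=lambda c: score_distance(predict_noise, forget_prompt, c, noisy_samples),
    )


# Toy denoiser (assumption, illustration only): each prompt shifts the
# prediction by a fixed per-prompt offset.
TOY_OFFSETS = {
    "a corgi on a beach": 1.0,
    "a dog on a beach": 0.8,
    "a beach": 0.2,
}


def toy_predict_noise(x_t, t, prompt):
    return [v * t + TOY_OFFSETS[prompt] for v in x_t]


rng = random.Random(0)
# Eight (noisy image, timestep) pairs; images are 4-element vectors here.
samples = [([rng.gauss(0, 1) for _ in range(4)], rng.random()) for _ in range(8)]

chosen = select_mapping_prompt(
    toy_predict_noise,
    "a corgi on a beach",
    ["a dog on a beach", "a beach"],
    samples,
)
print(chosen)  # a dog on a beach
```

In the toy setup the candidate that keeps the most context ("a dog on a beach") has the smallest prediction gap to the forget prompt, so it is selected as the mapping target. With a real model, the same `min` over candidates would run on actual denoiser outputs.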

Locality-Aware Replay uses the same metric to identify the retain concepts closest to the forget concept in the model's own representation and replays them as a local functional regularizer, directly shielding the most vulnerable neighborhood.
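
Choosing the replay set amounts to a nearest-neighbor query under the model-derived distance. The sketch below assumes a callable `distance(a, b)` (for example, a score-prediction distance) and a neighborhood size `k`; the concept names and distance values are hypothetical and exist only to make the example runnable.

```python
import heapq


def select_replay_concepts(forget_concept, retain_concepts, distance, k=2):
    """Return the k retain concepts nearest to the forget concept.

    distance(a, b) is assumed to be a model-derived dissimilarity;
    smaller values mean the model treats the two concepts as more similar.
    """
    return heapq.nsmallest(
        k, retain_concepts, key=lambda c: distance(forget_concept, c)
    )


# Hypothetical precomputed distances, for illustration only.
TOY_DISTANCES = {
    ("corgi", "shiba inu"): 0.05,
    ("corgi", "fox"): 0.12,
    ("corgi", "tabby cat"): 0.30,
    ("corgi", "sports car"): 0.90,
}


def toy_distance(a, b):
    return TOY_DISTANCES[(a, b)]


neighbors = select_replay_concepts(
    "corgi", ["sports car", "tabby cat", "fox", "shiba inu"], toy_distance, k=2
)
print(neighbors)  # ['shiba inu', 'fox']
```

The returned concepts would then be replayed during the unlearning update so that the student's outputs on them stay close to the teacher's. How many neighbors to replay is a tuning choice that the abstract does not specify.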

Combined with teacher-student distillation and lightweight $\ell_2$ parameter regularization, LACU maintains stable unlearning over 10 sequential steps, preserving significantly higher related retention ($RR_{\text{acc}}$) and general retention ($GR_{\text{acc}}$) than recent baselines.
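
The abstract names the ingredients of the training objective but does not state it. One way such terms are commonly combined at unlearning step $k$ is shown below, purely as an illustrative assumption: the symbols, the weights $\lambda_r$ and $\lambda_p$, and the exact form of each term are not taken from the paper.

$$
\mathcal{L}_k(\theta) =
\mathbb{E}_{x_t, t}\left[\left\| \epsilon_\theta(x_t, t, c_f) - \epsilon_{\theta_{k-1}}(x_t, t, c_m) \right\|_2^2\right]
+ \lambda_r \sum_{c_r \in \mathcal{N}(c_f)} \mathbb{E}_{x_t, t}\left[\left\| \epsilon_\theta(x_t, t, c_r) - \epsilon_{\theta_{k-1}}(x_t, t, c_r) \right\|_2^2\right]
+ \lambda_p \left\| \theta - \theta_{k-1} \right\|_2^2
$$

Here $\theta_{k-1}$ is the frozen teacher from the previous step, $c_f$ is a forget prompt, $c_m$ is its selected mapping prompt, and $\mathcal{N}(c_f)$ is the set of nearest retain concepts. The first term distills the teacher's behavior on the mapping prompt into the student's response to the forget prompt, the second is the local replay regularizer, and the third is the $\ell_2$ parameter penalty.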
