Locality-Aware Continual Unlearning for Diffusion Models
Abstract
Real-world deployment of text-to-image diffusion models requires continual concept removal as new privacy, copyright, or safety obligations arise over time.
Existing unlearning methods, however, are designed for single-step deletion and collapse after only 3-5 sequential applications.
We trace this instability to two compounding factors: (i) coarse mapping targets that cause degradation to accumulate unnecessarily across steps, and (ii) the absence of local protection for semantically neighboring concepts, whose shared internal representations make them the first to suffer collateral damage.
Because this damage is strongest in the local semantic neighborhood of the forget concept, global replay alone cannot prevent it.
Building on this analysis, we propose Locality-Aware Continual Unlearning (LACU), a framework with two complementary mechanisms.
Locality-Aware Target Selection chooses, for each forget prompt, the context-preserving mapping prompt that the diffusion model itself treats as most similar to the original prompt. Similarity is measured by score-prediction distance (how differently the model denoises the same noisy image under two text conditions), which keeps each update as small and targeted as possible.
Locality-Aware Replay uses the same metric to identify the retain concepts closest to the forget concept in the model's own representation and replays them as a local functional regularizer, directly shielding the most vulnerable neighborhood.
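To make the shared metric concrete, the following is a minimal sketch of how a score-prediction distance could drive both target selection and replay-neighbor selection. It is an illustration under stated assumptions, not the authors' implementation: the function names, the mean-squared form of the distance, the toy noise predictor `toy_eps`, and the 4-dimensional "text embeddings" are all hypothetical stand-ins for a real conditional denoiser and text encoder.

```python
import numpy as np

def score_prediction_distance(eps_fn, noisy_batch, timesteps, cond_a, cond_b):
    """Mean squared gap between noise predictions under two text conditions.

    Assumption: the abstract does not specify the exact form of the metric;
    a mean squared difference over sampled (x_t, t) pairs is used here.
    """
    gaps = [
        np.mean((eps_fn(x, t, cond_a) - eps_fn(x, t, cond_b)) ** 2)
        for x, t in zip(noisy_batch, timesteps)
    ]
    return float(np.mean(gaps))

def nearest_by_score_distance(eps_fn, noisy_batch, timesteps, anchor, candidates, k=1):
    """Return the names of the k candidates closest to `anchor`, nearest first."""
    dists = {
        name: score_prediction_distance(eps_fn, noisy_batch, timesteps, anchor, emb)
        for name, emb in candidates.items()
    }
    return sorted(dists, key=dists.get)[:k]

# Hypothetical toy denoiser: the condition shifts the prediction, scaled by t.
def toy_eps(x, t, cond):
    return x + t * cond

rng = np.random.default_rng(0)
noisy_batch = rng.normal(size=(3, 4))   # three noisy samples x_t
timesteps = [0.2, 0.5, 0.9]             # their (normalized) noise levels

forget = np.array([1.0, 0.0, 0.0, 0.0])  # embedding of the forget prompt

# Target selection: pick the mapping prompt the model treats as most similar.
mapping_candidates = {
    "a person": np.array([0.9, 0.1, 0.0, 0.0]),
    "a dog": np.array([0.0, 1.0, 0.0, 0.0]),
    "a landscape": np.array([0.0, 0.0, 1.0, 0.0]),
}
target = nearest_by_score_distance(
    toy_eps, noisy_batch, timesteps, forget, mapping_candidates, k=1
)

# Replay selection: pick the retain concepts nearest to the forget concept.
retain_concepts = {
    "neighbor_a": np.array([0.8, 0.2, 0.0, 0.0]),
    "neighbor_b": np.array([0.7, 0.0, 0.3, 0.0]),
    "distant": np.array([0.0, 0.0, 0.0, 1.0]),
}
replay = nearest_by_score_distance(
    toy_eps, noisy_batch, timesteps, forget, retain_concepts, k=2
)

print(target, replay)
```

In this toy setup the nearest mapping prompt and the two nearest retain concepts are the ones whose embeddings lie closest to the forget prompt, because `toy_eps` is linear in the condition. With a real diffusion model the ranking would reflect the network's learned behavior rather than raw embedding distance, which is the point of measuring similarity in the model's own prediction space.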
Combined with teacher-student distillation and lightweight $\ell_2$ parameter regularization, LACU maintains stable unlearning over 10 sequential steps, preserving significantly higher related retention ($RR_{\text{acc}}$) and general retention ($GR_{\text{acc}}$) than recent baselines.
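One plausible way to write down how these pieces combine is sketched below. The abstract gives no formulas, so every symbol is an assumption introduced for illustration: $\epsilon_\theta$ is the student noise predictor, $\epsilon_{\theta^\star}$ a frozen teacher copy, $c_f$ the forget prompt, $c_m$ its selected mapping prompt, $\mathcal{N}(c_f)$ the set of nearest retain concepts, and $\lambda_r$, $\lambda_2$ are hypothetical weights.

```latex
% Score-prediction distance between two text conditions (assumed form)
d_\theta(c_a, c_b) = \mathbb{E}_{x_t,\,t}
  \left\| \epsilon_\theta(x_t, t, c_a) - \epsilon_\theta(x_t, t, c_b) \right\|_2^2

% Per-step objective (assumed form): distill forget -> mapping target,
% replay local neighbors against the teacher, and penalize parameter drift
\mathcal{L}(\theta) =
  \mathbb{E}_{x_t,\,t}
  \left\| \epsilon_\theta(x_t, t, c_f) - \epsilon_{\theta^\star}(x_t, t, c_m) \right\|_2^2
  + \lambda_r \sum_{c_r \in \mathcal{N}(c_f)} \mathbb{E}_{x_t,\,t}
  \left\| \epsilon_\theta(x_t, t, c_r) - \epsilon_{\theta^\star}(x_t, t, c_r) \right\|_2^2
  + \lambda_2 \left\| \theta - \theta^\star \right\|_2^2
```

Under this reading, the first term performs the unlearning update, the second acts as the local functional regularizer on neighboring concepts, and the third is the lightweight $\ell_2$ parameter penalty.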