학술
기타
Recursive Learning Without Collapse: A Weighting-Based Stabilization Framework
arXiv Stat
조회 0
CC BY
이 매체는 공공·자유 라이선스로 본문을 직접 표시합니다.Statistics > Machine Learning
[Submitted on 25 Feb 2025 (v1), last revised 16 Jun 2026 (this version, v5)]
Title:Recursive Learning Without Collapse: A Weighting-Based Stabilization Framework
View PDF HTML (experimental)Abstract:Recent studies identified an intriguing phenomenon in recursive generative model training known as model collapse, where models trained on data generated by previous models exhibit severe performance degradation. Addressing this issue and developing more effective training strategies have become central challenges in generative model research. In this paper, we investigate this phenomenon within a novel framework, where generative models are iteratively trained on a combination of newly collected real data and synthetic data from the previous training step. To develop an optimal training strategy for integrating real and synthetic data, we evaluate the performance of a weighted training scheme in various scenarios, including Gaussian distribution estimation, generalized linear models, and nonparametric estimation. We theoretically characterize the impact of the mixing proportion and weighting scheme of synthetic data on the final model's performance. Our key finding is that, across different settings, the optimal weighting scheme under different proportions of synthetic data asymptotically follows a unified expression, revealing a fundamental trade-off between leveraging synthetic data and model performance. In some cases, the optimal weight assigned to real data corresponds to the reciprocal of the golden ratio. Finally, we validate our theoretical results on extensive simulated datasets and a real tabular dataset.
Submission history
From: Hengzhi He [view email][v1] Tue, 25 Feb 2025 10:15:16 UTC (2,207 KB)
[v2] Thu, 6 Mar 2025 16:03:59 UTC (2,207 KB)
[v3] Thu, 2 Oct 2025 07:02:19 UTC (3,246 KB)
[v4] Thu, 9 Oct 2025 03:00:19 UTC (3,246 KB)
[v5] Tue, 16 Jun 2026 03:44:04 UTC (3,033 KB)
Current browse context:
stat.ML
References & Citations
Loading...
Bibliographic and Citation Tools
Bibliographic Explorer (What is the Explorer?)
Connected Papers (What is Connected Papers?)
Litmaps (What is Litmaps?)
scite Smart Citations (What are Smart Citations?)
Code, Data and Media Associated with this Article
alphaXiv (What is alphaXiv?)
CatalyzeX Code Finder for Papers (What is CatalyzeX?)
DagsHub (What is DagsHub?)
Gotit.pub (What is GotitPub?)
Hugging Face (What is Huggingface?)
ScienceCast (What is ScienceCast?)
Demos
Recommenders and Search Tools
Influence Flower (What are Influence Flowers?)
CORE Recommender (What is CORE?)
arXivLabs: experimental projects with community collaborators
arXivLabs is a framework that allows collaborators to develop and share new arXiv features directly on our website.
Both individuals and organizations that work with arXivLabs have embraced and accepted our values of openness, community, excellence, and user data privacy. arXiv is committed to these values and only works with partners that adhere to them.
Have an idea for a project that will add value for arXiv's community? Learn more about arXivLabs.
이 뉴스, 독자들은 어떻게 느꼈나요?
첫 반응을 남겨보세요로그인하면 감정 반응에 참여할 수 있어요.
관련 뉴스
관련 뉴스 제보는 로그인 후 가능합니다.