Controllable Diffusion-Based Lesion Inpainting for Scalable Histopathology Data Augmentation
Abstract
Expert-annotated training data remains the critical bottleneck for AI in histopathology, particularly for rare pathologies where even dozens of cases may be unavailable.
While data augmentation offers a solution, existing methods fail to generate sufficiently realistic lesion morphologies that preserve tissue-specific architectures.
Here we present PathoGen, a diffusion-based generative model enabling controllable, high-fidelity lesion inpainting into benign histopathology images.
We validate PathoGen across four datasets representing kidney, skin, breast, and prostate pathology.
Quantitative assessment confirms PathoGen outperforms state-of-the-art baselines in image fidelity and distributional similarity.
In an evaluation by six expert pathologists, PathoGen-generated images were distinguished from real tissue images at a rate only slightly above chance (57.75% accuracy), demonstrating the strong perceptual realism of the generated lesions.
When pathologists ranked generation quality across PathoGen and all baselines, PathoGen achieved the highest win rate (35.4%).
Crucially, augmenting training sets with PathoGen-synthesized lesions improves segmentation Dice scores by up to 0.18 compared to traditional augmentations, with maximum benefit in data-scarce regimes.
By simultaneously generating realistic morphology and pixel-level annotations, PathoGen effectively addresses both data scarcity and annotation cost, two critical bottlenecks in computational pathology development.
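
For readers unfamiliar with the segmentation metric reported above, the following is a minimal sketch of the standard Dice coefficient, 2|A ∩ B| / (|A| + |B|), computed between a predicted and a reference binary mask. The function name `dice_score`, the nested-list mask format, and the convention that two empty masks score 1.0 are assumptions made for illustration; the abstract does not describe the authors' evaluation code.

```python
def dice_score(pred, target):
    """Dice coefficient between two binary masks given as nested lists of 0/1.

    Illustrative helper only; not taken from the PathoGen codebase.
    """
    # Flatten both masks and coerce every pixel to 0 or 1.
    flat_pred = [int(bool(v)) for row in pred for v in row]
    flat_target = [int(bool(v)) for row in target for v in row]
    if len(flat_pred) != len(flat_target):
        raise ValueError("masks must have the same number of pixels")

    # Pixels labelled as lesion in both masks.
    intersection = sum(p & t for p, t in zip(flat_pred, flat_target))
    total = sum(flat_pred) + sum(flat_target)

    # Assumed convention: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# Toy 2x3 masks: 3 lesion pixels each, 2 of them overlapping.
pred = [[0, 1, 1],
        [0, 1, 0]]
target = [[0, 1, 0],
          [0, 1, 1]]
score = dice_score(pred, target)
print(f"Dice = {score:.3f}")
```

On this scale, which runs from 0 (no overlap) to 1 (identical masks), the reported gain of up to 0.18 is an absolute difference in score.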