SIDA: Synthetic Image Driven Zero-shot Domain Adaptation

arXiv CS.AI

CC BY

Abstract

Zero-shot domain adaptation adapts a model to a target domain without using any target-domain images.

To enable adaptation without target images, existing studies use CLIP's embedding space and text descriptions to simulate target-like style features.

Despite this progress, we observe that these text-driven methods struggle to capture complex real-world variations, and that their alignment process significantly increases adaptation time.

Instead of relying on text descriptions, we explore solutions leveraging image data, which provides diverse and more fine-grained style cues.

In this work, we propose SIDA, a novel and efficient zero-shot domain adaptation method leveraging synthetic images.

To generate synthetic images, we first create detailed, source-like images and apply image translation to reflect the style of the target domain.

We then utilize the style features of these synthetic images as a proxy for the target domain.
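The abstract does not define "style features". A common choice in style-transfer work is the channel-wise mean and standard deviation of a feature map (AdaIN-style statistics). The sketch below assumes that definition, which may differ from what SIDA actually uses. The names `style_stats` and `apply_style` are hypothetical, and random arrays stand in for real encoder features.

```python
import numpy as np

def style_stats(feat, eps=1e-5):
    """Channel-wise mean and std of a (C, H, W) feature map (assumed style definition)."""
    mu = feat.mean(axis=(1, 2))
    sigma = np.sqrt(feat.var(axis=(1, 2)) + eps)
    return mu, sigma

def apply_style(feat, mu_t, sigma_t, eps=1e-5):
    """AdaIN-style re-normalization: swap the source statistics for target ones."""
    mu_s, sigma_s = style_stats(feat, eps)
    normalized = (feat - mu_s[:, None, None]) / sigma_s[:, None, None]
    return normalized * sigma_t[:, None, None] + mu_t[:, None, None]

rng = np.random.default_rng(0)
source_feat = rng.normal(0.0, 1.0, size=(4, 8, 8))     # stand-in for source features
synthetic_feat = rng.normal(2.0, 3.0, size=(4, 8, 8))  # stand-in for synthetic-image features

# Statistics of the synthetic image act as a proxy for the unseen target domain.
mu_t, sigma_t = style_stats(synthetic_feat)
stylized = apply_style(source_feat, mu_t, sigma_t)
```

Under this assumption, `stylized` keeps the spatial content of the source features while its per-channel mean matches that of the synthetic image.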

Based on these features, we introduce Domain Mix and Patch Style Transfer modules, which enable effective modeling of real-world variations.

In particular, Domain Mix blends multiple styles to expand the intra-domain representations, and Patch Style Transfer assigns different styles to individual patches.
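One way to picture the two modules is sketched below. It assumes that a style is a pair of channel-wise (mean, std) vectors, that blending is a random convex combination, and that patches form a regular non-overlapping grid. None of these details are stated in the abstract. The function names `domain_mix` and `patch_style_transfer` are illustrative, not the authors' API.

```python
import numpy as np

def domain_mix(styles, rng):
    """Blend several (mu, sigma) styles with random convex weights (assumed blending rule)."""
    mus = np.stack([m for m, _ in styles])     # shape (K, C)
    sigmas = np.stack([s for _, s in styles])  # shape (K, C)
    w = rng.dirichlet(np.ones(len(styles)))    # K non-negative weights summing to 1
    return w @ mus, w @ sigmas

def patch_style_transfer(feat, styles, patch, rng, eps=1e-5):
    """Re-normalize each patch of a (C, H, W) map with its own freshly mixed style."""
    out = np.empty_like(feat)
    _, height, width = feat.shape
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            block = feat[:, top:top + patch, left:left + patch]
            mu_s = block.mean(axis=(1, 2), keepdims=True)
            sigma_s = np.sqrt(block.var(axis=(1, 2), keepdims=True) + eps)
            mu_t, sigma_t = domain_mix(styles, rng)  # a different blend per patch
            out[:, top:top + patch, left:left + patch] = (
                (block - mu_s) / sigma_s * sigma_t[:, None, None]
                + mu_t[:, None, None]
            )
    return out

rng = np.random.default_rng(0)
# Two hypothetical target-like styles over 3 channels.
styles = [(np.full(3, 0.0), np.full(3, 1.0)), (np.full(3, 4.0), np.full(3, 2.0))]
feat = rng.normal(size=(3, 8, 8))
out = patch_style_transfer(feat, styles, patch=4, rng=rng)
```

Because each patch draws its own blend, a single feature map ends up carrying several styles at once, which is the intuition behind assigning different styles to individual patches.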

We demonstrate the effectiveness of our method by showing state-of-the-art performance in diverse zero-shot adaptation scenarios, particularly in challenging domains.

Moreover, our approach achieves high efficiency by significantly reducing the overall adaptation time.
