PhysMani: Physics-principled 3D World Model for Dynamic Object Manipulation
Abstract
Manipulating fast, dynamically moving targets in unstructured 3D environments remains challenging for embodied AI.
Existing vision-language-action models and world models struggle to capture accurate 3D geometry and to produce physically meaningful forecasts.
We propose PhysMani, a framework that couples a physics-principled 3D Gaussian world model with a future-aware action policy model.
The world model learns a divergence-free Gaussian velocity field via online optimization, enabling fast and physically grounded prediction of future dynamics.
The policy model integrates the predicted future dynamics of the 3D scene through a cross-attention module based on learnable tokens.
We introduce PhysMani-Bench, a dynamic manipulation benchmark with 16 tasks, and demonstrate higher success rates than strong baselines in both simulation and real-world robot experiments.
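
To make the divergence-free constraint concrete, the following is a minimal sketch and not the PhysMani implementation. It assumes one standard construction: defining the velocity as the curl of a vector potential, v = ∇ × A, which has zero divergence by identity. The potential `example_potential`, the helper names, and the finite-difference step sizes are all hypothetical choices for illustration; the abstract does not state how the field is actually parameterized or optimized.

```python
import numpy as np


def example_potential(p):
    """Hypothetical smooth vector potential A(x, y, z); illustration only."""
    x, y, z = p[:, 0], p[:, 1], p[:, 2]
    return np.stack([np.sin(y) * z, x * np.cos(z), x * y], axis=1)


def curl_velocity(potential, points, eps=1e-4):
    """Velocity v = curl(A) at each point, using central finite differences."""
    points = np.asarray(points, dtype=float)
    # jac[:, i, j] approximates dA_i / dx_j
    jac = np.zeros((points.shape[0], 3, 3))
    for j in range(3):
        step = np.zeros(3)
        step[j] = eps
        jac[:, :, j] = (potential(points + step) - potential(points - step)) / (2 * eps)
    return np.stack(
        [
            jac[:, 2, 1] - jac[:, 1, 2],
            jac[:, 0, 2] - jac[:, 2, 0],
            jac[:, 1, 0] - jac[:, 0, 1],
        ],
        axis=1,
    )


def divergence(field, points, eps=1e-3):
    """Numerical divergence of a vector field at each point."""
    points = np.asarray(points, dtype=float)
    div = np.zeros(points.shape[0])
    for j in range(3):
        step = np.zeros(3)
        step[j] = eps
        div += (field(points + step)[:, j] - field(points - step)[:, j]) / (2 * eps)
    return div


def advect(centers, potential, dt):
    """One explicit Euler step moving Gaussian centers along the velocity field."""
    return centers + dt * curl_velocity(potential, centers)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    centers = rng.uniform(-1.0, 1.0, size=(64, 3))  # stand-in Gaussian centers
    div = divergence(lambda p: curl_velocity(example_potential, p), centers)
    print("max |div v| =", np.max(np.abs(div)))
    print("advected shape:", advect(centers, example_potential, dt=0.05).shape)
```

Because the constraint is built into the parameterization, any potential yields a divergence-free (locally volume-preserving) motion field, so an optimizer would not need a separate penalty term for it. Whether PhysMani enforces the constraint this way or through a different mechanism is not specified in the abstract.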