World-Ego Modeling for Long-Horizon Embodied Tasks
World-Ego Modeling is a proposed framework that addresses performance degradation in long-horizon embodied tasks, particularly hybrid tasks that combine navigation and manipulation. It decomposes future evolution into two components, world and ego, and defines the boundary between them from three views: motion, semantic, and intention. The study analyzes three disentanglement strategies: post-, pre-, and full disentanglement. The paradigm is instantiated as the World-Ego Model (WEM), a unified embodied world model with implicit separate coupling. The paper is available on arXiv as arXiv:2605.19957.
Key facts
- World-Ego Modeling is a new conceptual paradigm for embodied intelligence.
- It decomposes future evolution into world and ego components.
- The world-ego boundary is defined from motion, semantic, and intention views.
- Three disentanglement strategies are analyzed: post-, pre-, and full disentanglement.
- The paradigm is instantiated as the World-Ego Model (WEM).
- WEM is a unified embodied world model with implicit separate coupling.
- The approach targets long-horizon hybrid tasks with navigation and manipulation.
- The paper is arXiv:2605.19957.
Entities
Institutions
- arXiv