Deformba: Adaptive State Fusion for Vision SSMs
A new research paper introduces Deformba, a context-adaptive method for Vision State Space Models (SSMs). SSMs offer linear-time complexity and strong sequence modeling, but existing vision SSMs rely on manually designed, fixed scanning methods and support only limited query-based interaction, which holds them back on vision tasks. Deformba dynamically augments spatial structural information to improve multi-view 3D fusion and other perception tasks. The paper is available on arXiv.
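To make the "fixed scanning" limitation concrete, here is a minimal sketch of a generic one-dimensional linear SSM run over an image flattened in two hand-picked orders. It is not Deformba's method and is not taken from the paper. The function names (`raster_scan`, `column_scan`, `ssm_scan`), the scalar state, and the parameter values `a`, `b`, `c` are all assumptions chosen for illustration.

```python
from typing import List

Grid = List[List[float]]


def raster_scan(grid: Grid) -> List[float]:
    """Flatten a 2D grid row by row: a fixed order that ignores image content."""
    return [value for row in grid for value in row]


def column_scan(grid: Grid) -> List[float]:
    """Flatten the same grid column by column: another fixed, hand-picked order."""
    rows, cols = len(grid), len(grid[0])
    return [grid[r][c] for c in range(cols) for r in range(rows)]


def ssm_scan(xs: List[float], a: float = 0.5, b: float = 1.0, c: float = 1.0) -> List[float]:
    """Scalar linear SSM (illustrative parameters, not from the paper):
        h_t = a * h_{t-1} + b * x_t
        y_t = c * h_t
    A single pass over the sequence, so the cost grows linearly with its length.
    """
    h = 0.0
    ys = []
    for x in xs:
        h = a * h + b * x
        ys.append(c * h)
    return ys


grid = [[1.0, 2.0],
        [3.0, 4.0]]

row_out = ssm_scan(raster_scan(grid))  # sequence 1, 2, 3, 4
col_out = ssm_scan(column_scan(grid))  # sequence 1, 3, 2, 4

print(row_out)  # [1.0, 2.5, 4.25, 6.125]
print(col_out)  # [1.0, 3.5, 3.75, 5.875]
```

The two outputs differ even though the pixels are identical. What the recurrence "sees" is determined by a traversal order fixed in advance, not by the image content. This is the kind of rigidity the summary attributes to existing vision SSMs.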
Key facts
- arXiv:2605.21308
- Deformba is a context-adaptive method for Vision SSMs
- SSMs have linear-time complexity
- Existing vision SSMs use manually designed fixed scanning methods
- Deformba dynamically augments spatial structural information
- Addresses limitations in multi-view 3D fusion
- Published on arXiv
- arXiv announce type: cross (cross-listed)
Entities
Institutions
- arXiv