FUS3DMaps: Dual-Layer Open-Vocabulary Semantic Mapping for Robots

other · 2026-05-07

FUS3DMaps is an online semantic mapping method that jointly maintains dense and instance-level open-vocabulary layers in a shared voxel map. The dual-layer design enables voxel-level semantic fusion of the two layers' embeddings, combining the complementary strengths of both approaches. It targets the scalability limitations of existing training-free methods, which rely on multi-view fusion of semantic embeddings. Because it operates on full, uncropped image frames, FUS3DMaps sidesteps segmentation and 2D-to-3D instance association. The paper, which proposes this cross-layer semantic fusion method, is available on arXiv under the identifier 2605.03669.
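The shared-voxel-map idea can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the names `DualLayerMap`, `Voxel`, and `fused_embedding`, the 5 cm voxel size, and above all the fusion rule, which here is a plain weighted average of the two normalized embeddings controlled by a hypothetical `alpha`. The actual cross-layer fusion in FUS3DMaps may differ.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Vec = List[float]
VoxelKey = Tuple[int, int, int]


def normalize(v: Vec) -> Vec:
    """Scale v to unit length; a zero vector is returned unchanged."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n > 0 else list(v)


@dataclass
class Voxel:
    dense: Optional[Vec] = None        # dense-layer embedding for this voxel
    instance_id: Optional[int] = None  # instance-layer membership, if any


@dataclass
class DualLayerMap:
    voxel_size: float = 0.05  # metres; hypothetical default
    voxels: Dict[VoxelKey, Voxel] = field(default_factory=dict)
    # Instance layer: one embedding per instance id (assumed layout).
    instances: Dict[int, Vec] = field(default_factory=dict)

    def key(self, point: Tuple[float, float, float]) -> VoxelKey:
        """Map a 3D point to the integer index of its voxel."""
        x, y, z = point
        s = self.voxel_size
        return (math.floor(x / s), math.floor(y / s), math.floor(z / s))

    def voxel_at(self, point: Tuple[float, float, float]) -> Voxel:
        """Return the voxel containing point, creating it if needed."""
        return self.voxels.setdefault(self.key(point), Voxel())

    def fused_embedding(self, point, alpha: float = 0.5) -> Optional[Vec]:
        """Blend both layers at one voxel (assumed rule: weighted average)."""
        voxel = self.voxels.get(self.key(point))
        if voxel is None:
            return None
        dense = voxel.dense
        inst = None
        if voxel.instance_id is not None:
            inst = self.instances.get(voxel.instance_id)
        if dense is None or inst is None:
            # Only one layer has data here: fall back to it.
            only = dense if dense is not None else inst
            return normalize(only) if only is not None else None
        d, i = normalize(dense), normalize(inst)
        return normalize([alpha * a + (1 - alpha) * b for a, b in zip(d, i)])


# Usage: one voxel that carries both a dense and an instance embedding.
demo_map = DualLayerMap()
demo_voxel = demo_map.voxel_at((0.12, 0.0, 0.31))
demo_voxel.dense = [1.0, 0.0]
demo_voxel.instance_id = 7
demo_map.instances[7] = [0.0, 1.0]
demo_fused = demo_map.fused_embedding((0.12, 0.0, 0.31))
```

Keeping both layers keyed by the same voxel index is what makes a per-voxel blend possible in this sketch: no separate lookup structure is needed to find which instance a dense feature belongs to.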

Key facts

FUS3DMaps is an online dual-layer semantic mapping method.
It jointly maintains dense and instance-level open-vocabulary layers.
Both layers are stored in a shared voxel map.
It enables voxel-level semantic fusion of layer embeddings.
The method combines complementary strengths of both semantic mapping approaches.
It addresses scalability limitations of existing training-free methods.
Those existing methods rely on multi-view fusion of semantic embeddings.
FUS3DMaps operates on full uncropped image frames.
It sidesteps segmentation and 2D-to-3D instance association.
The paper is published on arXiv with ID 2605.03669.
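Multi-view fusion of semantic embeddings, mentioned in the facts above, can be sketched as a per-voxel running average over the views that observe the voxel, followed by a cosine-similarity comparison against a query embedding. This is a generic illustration under stated assumptions, not the procedure used by FUS3DMaps or by any specific prior method: the class `VoxelFeature`, its `integrate` method, and the equal weighting of views are all hypothetical, and the query vector stands in for the output of a text encoder that is not modeled here.

```python
import math
from typing import List


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0


class VoxelFeature:
    """Running mean of the embeddings observed for one voxel (assumed rule)."""

    def __init__(self, dim: int) -> None:
        self.mean = [0.0] * dim
        self.count = 0

    def integrate(self, embedding: List[float]) -> None:
        """Fold one more view into the mean without storing past views."""
        self.count += 1
        k = self.count
        self.mean = [m + (e - m) / k for m, e in zip(self.mean, embedding)]


# Usage: two views observe the same voxel, then a query is scored against it.
feature = VoxelFeature(dim=2)
feature.integrate([1.0, 0.0])
feature.integrate([0.0, 1.0])
query = [1.0, 1.0]  # placeholder for a text-encoder output
score = cosine(feature.mean, query)
```

The incremental update keeps memory per voxel constant regardless of how many frames are integrated, which is one reason running averages are a natural fit for online mapping.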