SpatialGrammar: A DSL for LLM-Based 3D Indoor Scene Generation
Researchers have introduced SpatialGrammar, a domain-specific language (DSL) for generating interactive 3D indoor environments from natural-language input. It represents gravity-aligned layouts as bird's-eye-view (BEV) grid placements, which compile deterministically to valid 3D geometry and so allow constraints to be checked verifiably. Two systems build on the language: SG-Agent, a closed-loop system that iteratively refines scenes using compiler feedback, and SG-Mini, a 104M-parameter model trained on synthetic data. The approach targets the spatial errors and collisions that commonly occur in LLM-driven scene generation. The paper is available on arXiv under ID 2604.27555.
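To make the idea of compiling grid placements to geometry concrete, here is a minimal Python sketch. It is not SpatialGrammar's actual syntax or compiler, which this summary does not describe. The `Placement` and `Box` types, the 0.5 m cell size, and the function names are all assumptions chosen for illustration. The sketch maps each 2D grid footprint to a floor-aligned 3D box and then checks every pair of boxes for overlap.

```python
from dataclasses import dataclass
from typing import List, Tuple

CELL_SIZE_M = 0.5  # assumed grid resolution in metres; not taken from the paper


@dataclass(frozen=True)
class Placement:
    """Hypothetical BEV placement: an object footprint on a 2D grid."""
    name: str
    col: int          # grid column of the footprint's minimum corner (x axis)
    row: int          # grid row of the footprint's minimum corner (y axis)
    width: int        # footprint size in cells along x
    depth: int        # footprint size in cells along y
    height_m: float   # object height in metres


@dataclass(frozen=True)
class Box:
    """Axis-aligned 3D box resting on the floor (z = 0)."""
    name: str
    min_xyz: Tuple[float, float, float]
    max_xyz: Tuple[float, float, float]


def compile_placement(p: Placement) -> Box:
    """Deterministically map a grid placement to a floor-aligned 3D box."""
    x0, y0 = p.col * CELL_SIZE_M, p.row * CELL_SIZE_M
    x1, y1 = x0 + p.width * CELL_SIZE_M, y0 + p.depth * CELL_SIZE_M
    return Box(p.name, (x0, y0, 0.0), (x1, y1, p.height_m))


def boxes_collide(a: Box, b: Box) -> bool:
    """True if the boxes overlap with positive volume (touching faces do not count)."""
    return all(
        a.min_xyz[i] < b.max_xyz[i] and b.min_xyz[i] < a.max_xyz[i]
        for i in range(3)
    )


def find_collisions(placements: List[Placement]) -> List[Tuple[str, str]]:
    """Compile every placement, then report each colliding pair by name."""
    boxes = [compile_placement(p) for p in placements]
    return [
        (a.name, b.name)
        for i, a in enumerate(boxes)
        for b in boxes[i + 1:]
        if boxes_collide(a, b)
    ]


scene = [
    Placement("bed", col=0, row=0, width=4, depth=3, height_m=0.6),
    Placement("nightstand", col=4, row=0, width=1, depth=1, height_m=0.5),
    Placement("lamp", col=3, row=2, width=2, depth=2, height_m=1.5),
]
collisions = find_collisions(scene)
```

Because every object sits on the floor and occupies whole grid cells, the same input always produces the same boxes, so a check like `find_collisions` gives a repeatable answer. In this toy scene the nightstand only touches the bed, while the lamp's footprint overlaps it.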
Key facts
- SpatialGrammar is a domain-specific language for 3D indoor scene generation
- It uses BEV grid placements with deterministic compilation to valid 3D geometry
- SG-Agent is a closed-loop system with compiler feedback
- SG-Mini is a 104M-parameter model trained on synthetic data
- The system addresses spatial errors and collisions in LLM-based approaches
- Paper published on arXiv with ID 2604.27555
- Application areas include virtual reality, gaming, and embodied AI
- The approach enables verifiable constraint checking
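The closed-loop idea behind SG-Agent can be sketched as a check-and-repair loop. The code below is an illustration under stated assumptions, not the paper's implementation. The scene encoding, the diagnostic strings, `check_scene`, `refine`, and the toy `shift_right_fixer` (which stands in for an LLM call) are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical scene encoding: name -> (col, row, width, depth) in grid cells.
Scene = Dict[str, Tuple[int, int, int, int]]


def check_scene(scene: Scene, grid_w: int, grid_d: int) -> List[str]:
    """Stand-in for compiler diagnostics: out-of-bounds objects and footprint overlaps."""
    errors: List[str] = []
    items = sorted(scene.items())
    for name, (c, r, w, d) in items:
        if c < 0 or r < 0 or c + w > grid_w or r + d > grid_d:
            errors.append(f"out_of_bounds:{name}")
    for i, (na, (ca, ra, wa, da)) in enumerate(items):
        for nb, (cb, rb, wb, db) in items[i + 1:]:
            if ca < cb + wb and cb < ca + wa and ra < rb + db and rb < ra + da:
                errors.append(f"overlap:{na}:{nb}")
    return errors


def refine(
    scene: Scene,
    propose_fix: Callable[[Scene, List[str]], Scene],
    grid_w: int,
    grid_d: int,
    max_rounds: int = 5,
) -> Tuple[Scene, List[str], int]:
    """Feed diagnostics back to the proposer until the scene is clean or rounds run out."""
    errors = check_scene(scene, grid_w, grid_d)
    rounds = 0
    while errors and rounds < max_rounds:
        scene = propose_fix(scene, errors)
        errors = check_scene(scene, grid_w, grid_d)
        rounds += 1
    return scene, errors, rounds


def shift_right_fixer(scene: Scene, errors: List[str]) -> Scene:
    """Toy stand-in for the LLM: move the second object of the first overlap one column right."""
    fixed = dict(scene)
    for err in errors:
        kind, *names = err.split(":")
        if kind == "overlap":
            c, r, w, d = fixed[names[1]]
            fixed[names[1]] = (c + 1, r, w, d)
            break
    return fixed


draft = {"bed": (0, 0, 4, 3), "desk": (3, 0, 2, 1)}
final_scene, remaining, rounds_used = refine(draft, shift_right_fixer, grid_w=8, grid_d=6)
```

The point of the sketch is the control flow: the checker's output is deterministic, so the proposer can be told exactly which constraint failed, and the loop stops as soon as no diagnostics remain. A real agent would replace `shift_right_fixer` with a model call that receives the diagnostics as part of its prompt.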
Entities
Institutions
- arXiv