ARTFEED — Contemporary Art Intelligence

SpatialGrammar: A DSL for LLM-Based 3D Indoor Scene Generation

ai-technology · 2026-05-01

Researchers have introduced SpatialGrammar, a domain-specific language (DSL) for generating interactive 3D indoor environments from natural-language prompts. The language represents gravity-aligned layouts as placements on a bird's-eye-view (BEV) grid, and a deterministic compiler turns those placements into valid 3D geometry, so constraints can be checked reliably. Two models accompany the language: SG-Agent, an iterative closed-loop system that refines scenes using compiler feedback, and SG-Mini, a 104M-parameter model trained on synthetic data. The approach targets the spatial errors and object collisions that are common in LLM-driven scene generation. The paper is available on arXiv under ID 2604.27555.
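To make the idea of grid placements plus deterministic compilation concrete, here is a minimal Python sketch. It is not SpatialGrammar's actual syntax or compiler, which this summary does not describe. The names `Placement`, `Box`, `compile_scene`, and the 0.5 m cell size are all assumptions chosen for illustration. Each object occupies a rectangle of grid cells and rests on the floor, and the compile step either emits an axis-aligned 3D box or reports an error.

```python
from dataclasses import dataclass

CELL = 0.5  # metres per grid cell (assumed value, not from the paper)


@dataclass(frozen=True)
class Placement:
    """Hypothetical BEV placement: a rectangle of grid cells plus a height."""
    name: str
    col: int
    row: int
    width: int     # cells along x
    depth: int     # cells along y
    height: float  # metres


@dataclass(frozen=True)
class Box:
    """Axis-aligned 3D box produced by the compile step."""
    name: str
    min_xyz: tuple
    max_xyz: tuple


def compile_scene(placements, cols, rows):
    """Turn placements into floor-resting boxes and collect error messages."""
    occupied = {}  # (col, row) -> name of the object covering that cell
    boxes, errors = [], []
    for p in placements:
        if p.col < 0 or p.row < 0 or p.col + p.width > cols or p.row + p.depth > rows:
            errors.append(f"{p.name}: outside the {cols}x{rows} grid")
            continue
        cells = [(c, r)
                 for c in range(p.col, p.col + p.width)
                 for r in range(p.row, p.row + p.depth)]
        clash = sorted({occupied[cell] for cell in cells if cell in occupied})
        if clash:
            errors.append(f"{p.name}: overlaps {', '.join(clash)}")
            continue
        for cell in cells:
            occupied[cell] = p.name
        # Gravity alignment: every box starts at floor level (z = 0).
        boxes.append(Box(p.name,
                         (p.col * CELL, p.row * CELL, 0.0),
                         ((p.col + p.width) * CELL, (p.row + p.depth) * CELL, p.height)))
    return boxes, errors


scene = [Placement("bed", 0, 0, 4, 3, 0.6), Placement("desk", 3, 2, 2, 1, 0.75)]
boxes, errors = compile_scene(scene, cols=8, rows=6)
print(boxes)
print(errors)
```

Because the same placements always yield the same boxes and the same error list, a collision such as the desk overlapping the bed is caught by a plain cell lookup rather than by inspecting rendered geometry.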

Key facts

  • SpatialGrammar is a domain-specific language for 3D indoor scene generation
  • It uses BEV grid placements with deterministic compilation to valid 3D geometry
  • SG-Agent is a closed-loop system with compiler feedback
  • SG-Mini is a 104M-parameter model trained on synthetic data
  • The system addresses spatial errors and collisions in LLM-based approaches
  • Paper published on arXiv with ID 2604.27555
  • Application areas include virtual reality, gaming, and embodied AI
  • The approach enables verifiable constraint checking
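The closed-loop idea behind SG-Agent, proposing a layout, checking it, and feeding the errors back, can be sketched as follows. This is an illustrative outline under stated assumptions, not the authors' implementation: `check`, `refine`, and `scripted_proposer` are hypothetical names, the layout format is invented, and the proposer is a scripted stand-in for a language model call.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical layout format: object name -> (col, row, width, depth) in grid cells.
Layout = Dict[str, Tuple[int, int, int, int]]


def check(layout: Layout, cols: int, rows: int) -> List[str]:
    """Stand-in for compiler feedback: report out-of-bounds and colliding objects."""
    errors, owner = [], {}
    for name, (col, row, w, d) in layout.items():
        if col < 0 or row < 0 or col + w > cols or row + d > rows:
            errors.append(f"{name}: out of bounds")
            continue
        cells = {(c, r) for c in range(col, col + w) for r in range(row, row + d)}
        hits = sorted({owner[cell] for cell in cells if cell in owner})
        if hits:
            errors.append(f"{name}: collides with {', '.join(hits)}")
            continue
        for cell in cells:
            owner[cell] = name
    return errors


def refine(propose: Callable[[List[str]], Layout], cols: int, rows: int,
           max_rounds: int = 3) -> Tuple[Optional[Layout], int]:
    """Ask for a layout, check it, and pass any errors back until it is clean."""
    feedback: List[str] = []
    for round_no in range(1, max_rounds + 1):
        layout = propose(feedback)
        feedback = check(layout, cols, rows)
        if not feedback:
            return layout, round_no
    return None, max_rounds


def scripted_proposer(feedback: List[str]) -> Layout:
    """Scripted stand-in for a model: moves the table once it sees an error."""
    if not feedback:
        return {"sofa": (0, 0, 3, 2), "table": (2, 1, 2, 2)}
    return {"sofa": (0, 0, 3, 2), "table": (4, 1, 2, 2)}


layout, rounds = refine(scripted_proposer, cols=8, rows=6)
print(layout, rounds)
```

Here the first proposal places the table on a cell the sofa already covers, the checker reports the collision, and the second proposal passes. A real system would replace `scripted_proposer` with a model that reads the error messages.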

Entities

Institutions

  • arXiv