SceneCode: AI Generates Editable Indoor Scenes with Articulated Objects from Text
A new AI framework called SceneCode generates executable, code-driven indoor worlds from natural language prompts, enabling editable scenes with articulated objects. Unlike traditional pipelines that produce static meshes, SceneCode uses a room-level agentic backbone to create structured house layouts and emits per-object AssetRequests through a planner-designer-critic loop. Each request is then routed to on-demand asset generation, allowing for object-level controllability and the production of new interactable assets. The work addresses limitations in indoor scene synthesis for embodied AI, robotic manipulation, and simulation-based policy evaluation.
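The control flow described above can be sketched in a few lines of Python. This is an illustrative toy, not SceneCode's actual code: the summary only names the AssetRequest concept and the planner-designer-critic loop, so every field, function name, and rule below (`planner`, `designer`, `critic`, `design_room`, `route`, and the "at least one articulated object per room" check) is a hypothetical stand-in chosen to make the loop concrete.

```python
from dataclasses import dataclass


@dataclass
class AssetRequest:
    # Hypothetical fields; the source only names the AssetRequest concept.
    name: str
    room: str
    articulated: bool = False


def planner(prompt: str) -> list[str]:
    # Stand-in planner: treats each comma-separated phrase as one room.
    return [part.strip() for part in prompt.split(",") if part.strip()]


def designer(room: str, feedback: list[str]) -> list[AssetRequest]:
    # Stand-in designer: proposes a fixed static object, plus one
    # articulated object for every item the critic asked for.
    requests = [AssetRequest(name="table", room=room)]
    requests += [AssetRequest(name=n, room=room, articulated=True) for n in feedback]
    return requests


def critic(requests: list[AssetRequest]) -> list[str]:
    # Stand-in critic (assumed rule): each room needs an articulated object.
    if any(r.articulated for r in requests):
        return []
    return ["cabinet"]


def design_room(room: str, max_rounds: int = 3) -> list[AssetRequest]:
    # Designer and critic alternate until the critic has no objections
    # or the round budget runs out.
    feedback: list[str] = []
    requests: list[AssetRequest] = []
    for _ in range(max_rounds):
        requests = designer(room, feedback)
        issues = critic(requests)
        if not issues:
            break
        feedback = feedback + issues
    return requests


def route(request: AssetRequest) -> str:
    # Stand-in router: picks a generator per request (an assumption; the
    # source only says requests are routed to on-demand asset generation).
    kind = "articulated" if request.articulated else "static"
    return f"{kind}:{request.room}/{request.name}"


def generate_scene(prompt: str) -> list[str]:
    return [route(req) for room in planner(prompt) for req in design_room(room)]
```

Calling `generate_scene("kitchen, bedroom")` in this toy yields one static and one articulated asset per room, because the critic rejects the designer's first proposal and the second round adds a cabinet. The point of the sketch is that each object is a separate request, which is what makes object-level editing possible.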
Key facts
- SceneCode compiles natural language prompts into executable, code-driven indoor worlds.
- It generates editable scenes with articulated objects, unlike static mesh pipelines.
- A room-level agentic backbone creates structured house layouts.
- Per-object AssetRequests are emitted through a planner-designer-critic loop.
- The framework enables on-demand production of interactable assets.
- It targets applications in embodied AI, robotic manipulation, and simulation-based policy evaluation.
- The paper is available on arXiv with ID 2605.19587.
- SceneCode is described as a framework for programmatic world generation.
Entities
Institutions
- arXiv