ARTFEED — Contemporary Art Intelligence

MUSE Benchmark Evaluates Text-to-CAD for Industrial Design

ai-technology · 2026-05-28

Researchers have introduced MUSE, a benchmark for Text-to-CAD generation that targets complex, editable boundary representation (B-Rep) assemblies. Where existing benchmarks score single-part CAD models on geometric similarity, MUSE evaluates functionality, manufacturability, and assemblability through a three-stage protocol: a code check, a geometric check, and design-intent alignment. The final stage applies design-specific rubrics to assess practical design quality beyond shape matching. To make evaluation scalable, MUSE uses a rubric-based vision-language model (VLM) judge. The benchmark pairs practical design instances with structured Design Specifications and aims to advance text-driven 3D generation for industrial product design. The work is described in arXiv:2605.28579.
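To make the three-stage idea concrete, the following is a minimal Python sketch, not MUSE's actual implementation. It assumes the generated CAD program is a Python script, that the stages run in order and stop at the first failure, and that the final score is the fraction of rubric criteria satisfied; none of these details are confirmed by the source. All names (`StageResult`, `code_check`, `make_geometry_stage`, `make_intent_stage`, `run_pipeline`) are hypothetical, and the `similarity` and `judge` callables stand in for a real geometry comparison and a real VLM judge.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class StageResult:
    name: str
    score: float  # normalized to [0, 1]
    passed: bool


# A stage takes the generated CAD program text and returns a result.
Stage = Callable[[str], StageResult]


def code_check(cad_code: str) -> StageResult:
    """Stage 1 (assumed): the program must at least parse as Python."""
    try:
        compile(cad_code, "<candidate>", "exec")  # syntax check only, no execution
    except SyntaxError:
        return StageResult("code", 0.0, False)
    return StageResult("code", 1.0, True)


def make_geometry_stage(similarity: Callable[[str], float], threshold: float) -> Stage:
    """Stage 2 (assumed): compare a geometric similarity score to a threshold."""
    def stage(cad_code: str) -> StageResult:
        score = similarity(cad_code)
        return StageResult("geometry", score, score >= threshold)
    return stage


def rubric_score(verdicts: Dict[str, bool]) -> float:
    """Fraction of rubric criteria the judge marked as satisfied."""
    if not verdicts:
        return 0.0
    return sum(1 for ok in verdicts.values() if ok) / len(verdicts)


def make_intent_stage(judge: Callable[[str, str], bool],
                      rubric: List[str], threshold: float) -> Stage:
    """Stage 3 (assumed): ask a judge about each design-specific criterion."""
    def stage(cad_code: str) -> StageResult:
        verdicts = {criterion: judge(cad_code, criterion) for criterion in rubric}
        score = rubric_score(verdicts)
        return StageResult("intent", score, score >= threshold)
    return stage


def run_pipeline(cad_code: str, stages: List[Stage]) -> List[StageResult]:
    """Run stages in order; stop at the first failure (an assumption of this sketch)."""
    results: List[StageResult] = []
    for stage in stages:
        result = stage(cad_code)
        results.append(result)
        if not result.passed:
            break
    return results
```

In use, a caller would build the stage list once, for example `[code_check, make_geometry_stage(sim, 0.5), make_intent_stage(judge, rubric, 0.6)]`, and pass each generated program through `run_pipeline`. The thresholds here are illustrative placeholders, not values from the benchmark.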

Key facts

  • MUSE is a Text-to-CAD benchmark for complex B-Rep assemblies.
  • It evaluates functionality, manufacturability, and assemblability.
  • Uses a three-stage protocol: code check, geometric check, design-intent alignment.
  • Final stage uses design-specific rubrics.
  • Employs a rubric-based VLM judge for scalable evaluation.
  • Pairs design instances with structured Design Specifications.
  • Aims to support industrial product design.
  • Published on arXiv with ID 2605.28579.

Entities

Institutions

  • arXiv
