ARTFEED — Contemporary Art Intelligence

Dooly: Configuration-Agnostic Profiling for LLM Inference Simulation

ai-technology · 2026-05-11

A new system called Dooly, described in an arXiv preprint (2605.07985), addresses the high cost of profiling large language model (LLM) inference configurations. Traditional profile-based simulators re-profile every operation from scratch for each configuration, which makes exploring the configuration space expensive. Dooly exploits two structural observations: every input dimension is either fixed by the model configuration or request-dependent, and many configuration values (e.g., head size, layer count) recur across models. By performing a single inference pass and labeling operations, Dooly achieves configuration-agnostic, redundancy-aware profiling, which enables efficient simulation across hardware, serving engines, attention backends, and model architectures.
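To make the redundancy-aware idea concrete, here is a minimal Python sketch, not Dooly's actual implementation. It assumes that a profiled operation can be keyed by its name plus the dimensions fixed by the model configuration, so two models that share a value such as head size reuse one measurement. The names OpSignature, ProfileCache, fake_measure, and ops_for_model are hypothetical, and the cost function is a stand-in for a real hardware measurement.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


# Hypothetical label for one operation; the fields are illustrative
# assumptions, not Dooly's data model.
@dataclass(frozen=True)
class OpSignature:
    name: str                    # e.g. "attention", "mlp"
    fixed_dims: Tuple[int, ...]  # dimensions set by the model configuration


class ProfileCache:
    """Measures each distinct operation signature at most once."""

    def __init__(self, measure: Callable[[OpSignature], float]) -> None:
        self._measure = measure
        self._latency: Dict[OpSignature, float] = {}
        self.measure_calls = 0  # counts real measurements, not lookups

    def latency(self, sig: OpSignature) -> float:
        if sig not in self._latency:
            self.measure_calls += 1
            self._latency[sig] = self._measure(sig)
        return self._latency[sig]


def fake_measure(sig: OpSignature) -> float:
    # Stand-in for a hardware measurement: product of the fixed
    # dimensions, in arbitrary units.
    cost = 1.0
    for dim in sig.fixed_dims:
        cost *= dim
    return cost * 1e-6


def ops_for_model(head_size: int, hidden: int) -> List[OpSignature]:
    # Hypothetical two-operation model, just enough to show reuse.
    return [
        OpSignature("attention", (head_size,)),
        OpSignature("mlp", (hidden, 4 * hidden)),
    ]


cache = ProfileCache(fake_measure)
model_a = ops_for_model(head_size=128, hidden=4096)
model_b = ops_for_model(head_size=128, hidden=8192)  # same head size as model_a

for sig in model_a + model_b:
    cache.latency(sig)

print(cache.measure_calls)  # 3: the shared attention signature is measured once
```

The two models contain four operations between them, but only three measurements are taken because the attention signature recurs. A from-scratch profiler in this toy setting would take all four.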

Key facts

  • Dooly is a configuration-agnostic, redundancy-aware profiling system for LLM inference simulation.
  • It is described in arXiv preprint 2605.07985.
  • Traditional profile-based simulators hard-code operation sets and re-profile from scratch for each configuration.
  • Dooly performs a single inference pass and labels operations.
  • It exploits the fact that many model-configuration values recur across models.
  • Dooly enables efficient exploration of hardware, serving engines, attention backends, and model architectures.

Entities

Institutions

  • arXiv
