ARTFEED — Contemporary Art Intelligence

Frontier: A Simulator for Modern LLM Inference Serving

ai-technology · 2026-05-22

Frontier is a discrete-event simulator for modern LLM inference serving. It targets the gaps in existing simulators: disaggregated execution, complex parallelism, runtime optimizations, and stateful workloads such as reasoning and RL rollouts. It models co-location, Prefill-Decode Disaggregation (PDD), and Attention-FFN Disaggregation (AFD) using role-specific cluster workers, and it incorporates key runtime optimizations with the goal of decision-grade accuracy.
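
To make the idea of discrete-event simulation with role-specific workers concrete, here is a minimal Python sketch of a Prefill-Decode Disaggregation pipeline. It is not Frontier's code or API: the names `Request`, `Worker`, and `simulate`, the per-token costs, and the KV-transfer delay are all hypothetical, and each role is reduced to a single worker that handles one request at a time with no batching.

```python
import heapq
from dataclasses import dataclass
from itertools import count


@dataclass
class Request:
    rid: int
    arrival_ms: float
    prompt_tokens: int
    output_tokens: int
    finish_ms: float = 0.0


class Worker:
    """Hypothetical worker for one role; serves one request at a time."""

    def __init__(self, role):
        self.role = role
        self.free_at_ms = 0.0


def simulate(requests, prefill_ms_per_token=1.0, decode_ms_per_token=20.0,
             kv_transfer_ms=5.0):
    """Run a toy PDD simulation; all cost parameters are assumed values."""
    events = []   # min-heap of (time, sequence number, kind, request)
    seq = count() # tie-breaker so the heap never compares Request objects
    for req in requests:
        heapq.heappush(events, (req.arrival_ms, next(seq), "arrive", req))

    prefill, decode = Worker("prefill"), Worker("decode")
    done = []
    while events:
        now, _, kind, req = heapq.heappop(events)
        if kind == "arrive":
            # Prefill starts once the request has arrived and the worker is free.
            start = max(now, prefill.free_at_ms)
            prefill.free_at_ms = start + req.prompt_tokens * prefill_ms_per_token
            # The KV cache is then shipped to the decode worker.
            ready = prefill.free_at_ms + kv_transfer_ms
            heapq.heappush(events, (ready, next(seq), "decode_ready", req))
        elif kind == "decode_ready":
            start = max(now, decode.free_at_ms)
            decode.free_at_ms = start + req.output_tokens * decode_ms_per_token
            heapq.heappush(events, (decode.free_at_ms, next(seq), "finish", req))
        else:  # "finish"
            req.finish_ms = now
            done.append(req)
    return done


if __name__ == "__main__":
    reqs = [Request(0, 0.0, 100, 10), Request(1, 0.0, 50, 5)]
    for r in simulate(reqs):
        print(r.rid, r.finish_ms - r.arrival_ms)
```

The event queue advances simulated time from one event to the next instead of ticking a clock, which is what makes this style of simulator cheap enough to sweep a large design space. A realistic simulator would replace the fixed per-token costs with profiled or modeled operator latencies and add batching, parallelism, and scheduling policies.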

Key facts

  • Frontier is a discrete-event simulator for LLM inference serving.
  • It addresses disaggregated execution, complex parallelism, runtime optimizations, and stateful workloads.
  • Existing simulators lack architectural completeness and fidelity for modern systems.
  • Frontier models co-location, PDD, and AFD with role-specific cluster workers.
  • It incorporates key runtime optimizations.
  • The paper is available on arXiv with ID 2605.21312.
  • Modern LLM serving is no longer homogeneous or monolithic.
  • Simulation is attractive for exploring the growing design space.

Entities

Institutions

  • arXiv
