Compiling Agentic Workflows into LLM Weights Cuts Costs by 100x

ai-technology · 2026-05-23

A new arXiv paper (2605.22502) proposes compiling agentic workflows directly into LLM weights via fine-tuning, creating 'subterranean agents' that achieve near-frontier quality at two orders of magnitude lower cost. Current agent orchestration frameworks (LangGraph, CrewAI, Google ADK, OpenAI Agents SDK, Semantic Kernel, Strands, LlamaIndex) have drawn more than 290,000 GitHub stars, but they rely on an external orchestrator that sits above the LLM and injects instructions and routing decisions each turn. Prior work (SimpleTOD, FireAct, SynTOD, WorkflowLLM, Agent Lumos) had already demonstrated the technique, yet developer adoption has favored orchestration. Compiling the workflow into weights addresses three drawbacks of that design: context window consumption, dependence on frontier models, and exposure of proprietary procedures to third-party providers.
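To make the context window point concrete, here is a minimal sketch of the two prompt-building styles. Everything in it is a hypothetical illustration and not taken from the paper: the workflow text, the function names, and the whitespace-based token count are stand-ins. It only models the instructions an orchestrator re-sends each turn; it says nothing about model size or pricing, so it does not reproduce the paper's cost figure.

```python
# Hypothetical workflow instructions that an external orchestrator
# would inject into the prompt on every turn.
WORKFLOW_INSTRUCTIONS = (
    "Step 1: greet the user. Step 2: collect the order number. "
    "Step 3: look up the order. Step 4: offer a refund or replacement."
)


def count_tokens(text: str) -> int:
    """Crude whitespace token count; a stand-in for a real tokenizer."""
    return len(text.split())


def orchestrated_prompt(history: list[str], user_message: str) -> str:
    """Orchestrator style: instructions are prepended to every prompt."""
    return "\n".join([WORKFLOW_INSTRUCTIONS, *history, user_message])


def compiled_prompt(history: list[str], user_message: str) -> str:
    """Compiled style: the workflow is assumed to live in the fine-tuned
    weights, so the prompt carries only the conversation itself."""
    return "\n".join([*history, user_message])


def total_prompt_tokens(build_prompt, messages: list[str]) -> int:
    """Sum the prompt tokens sent across a whole conversation."""
    history: list[str] = []
    total = 0
    for message in messages:
        total += count_tokens(build_prompt(history, message))
        history.append(message)
    return total


if __name__ == "__main__":
    conversation = ["hi", "order 123", "refund please"]
    with_orchestrator = total_prompt_tokens(orchestrated_prompt, conversation)
    compiled = total_prompt_tokens(compiled_prompt, conversation)
    print("orchestrated prompt tokens:", with_orchestrator)
    print("compiled prompt tokens:", compiled)
    print("overhead from injected instructions:", with_orchestrator - compiled)
```

In this toy model the overhead grows linearly with the number of turns, because the same instructions are re-sent each time. A real orchestrator would typically inject routing state and tool descriptions as well.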

Key facts

arXiv paper 2605.22502 proposes compiling agentic workflows into LLM weights
Subterranean agents achieve near-frontier quality at two orders of magnitude lower cost
Agent orchestration frameworks have drawn more than 290,000 GitHub stars
Frameworks include LangGraph, CrewAI, Google ADK, OpenAI Agents SDK, Semantic Kernel, Strands, LlamaIndex
Prior work includes SimpleTOD, FireAct, SynTOD, WorkflowLLM, Agent Lumos
The current architecture places an external orchestrator above the LLM
The approach addresses context window consumption, frontier model dependency, and exposure of proprietary procedures
Developer adoption has overwhelmingly favored orchestration

Entities

Frameworks and prior work

LangGraph
CrewAI
Google ADK
OpenAI Agents SDK
Semantic Kernel
Strands
LlamaIndex
SimpleTOD
FireAct
SynTOD
WorkflowLLM
Agent Lumos

Sources

arXiv cs.AI — 2026-05-23