ARTFEED — Contemporary Art Intelligence

Mendral cuts LLM costs by keeping 80% of CI failures on a cheap triage agent

other · 2026-04-29

Mendral reduced its LLM costs with a model hierarchy: a cheap Haiku agent triages CI failures and routes only 20% of them to the expensive Opus model. Haiku detects duplicates with exact matching and semantic search, and a triager match costs 25x less than a full investigation. Opus plans each investigation and spawns Haiku sub-agents for focused tasks, so it never reads raw logs itself. Agents reach logs through a SQL interface to ClickHouse rather than having log text pushed into their prompts, which avoids prompt bloat. In one case study, Opus diagnosed a pnpm install failure (a missing 'make') by having sub-agents query trend data and git history. Haiku handles about 65% of input tokens but only about 36% of LLM spend; without the hierarchy, the daily bill would double. The pattern applies to other high-volume event streams, such as security logs or IoT telemetry.
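The triage step described above can be sketched roughly as follows. This is an illustrative sketch only, not Mendral's implementation: the names `KnownFailures`, `fingerprint`, and `triage`, the 0.9 similarity threshold, and the in-memory cosine search (standing in for a vector store such as pgvector) are all assumptions, and embeddings are expected to come from whatever embedding model the caller uses.

```python
import hashlib
import math
from dataclasses import dataclass, field

SIMILARITY_THRESHOLD = 0.9  # assumed cutoff, not taken from the source


def fingerprint(log_excerpt: str) -> str:
    """Hash a lowercased, whitespace-normalised excerpt for exact matching."""
    normalised = " ".join(log_excerpt.lower().split())
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def cosine(a, b) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


@dataclass
class KnownFailures:
    """Hypothetical in-memory store of already-investigated failures."""
    by_fingerprint: dict = field(default_factory=dict)  # fingerprint -> issue id
    embeddings: list = field(default_factory=list)      # (vector, issue id) pairs

    def add(self, issue_id, log_excerpt, embedding):
        self.by_fingerprint[fingerprint(log_excerpt)] = issue_id
        self.embeddings.append((embedding, issue_id))


def triage(store, log_excerpt, embedding):
    """Return (decision, issue_id): exact duplicate, semantic duplicate, or escalate."""
    exact = store.by_fingerprint.get(fingerprint(log_excerpt))
    if exact is not None:
        return ("duplicate_exact", exact)

    # Linear scan for the nearest known failure; a real system would use an index.
    best_score, best_issue = 0.0, None
    for vector, issue_id in store.embeddings:
        score = cosine(embedding, vector)
        if score > best_score:
            best_score, best_issue = score, issue_id

    if best_issue is not None and best_score >= SIMILARITY_THRESHOLD:
        return ("duplicate_semantic", best_issue)
    return ("escalate", None)  # only these cases would reach the expensive model
```

Checking the cheap exact match first means the similarity search runs only for failures that are not verbatim repeats, and only the "escalate" outcome would trigger a full investigation.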

Key facts

  • Opus 4.6 ends up costing less than Sonnet 4.0 because of the model hierarchy
  • 80% of CI failures never reach Opus; a triager match costs 25x less than a full investigation
  • Haiku agent uses exact matching and semantic search (pgvector) for duplicate detection
  • Agents access logs via a SQL interface to ClickHouse, rather than having logs pushed into prompts
  • Opus spawns Haiku sub-agents for digging; sub-agents capped at one level deep
  • Case study: Opus diagnosed 'gyp ERR! not found: make' via sub-agents querying git log and ClickHouse
  • Haiku handles ~65% of input tokens but ~36% of LLM spend
  • Pattern generalizes to security logs, IoT telemetry, financial data
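
The token and spend shares above can be checked against the "bill doubles" claim with a little arithmetic. The sketch below is a back-of-the-envelope model, not figures from Mendral: it assumes spend scales with input tokens at a single blended per-token rate for each model, and that an all-Opus setup would process the same tokens. The function name `implied_costs` is hypothetical.

```python
def implied_costs(haiku_token_share: float, haiku_spend_share: float):
    """Return (Opus/Haiku blended price ratio, all-Opus cost / hierarchy cost).

    Assumes spend = tokens * blended per-token price for each model.
    """
    opus_token_share = 1.0 - haiku_token_share
    opus_spend_share = 1.0 - haiku_spend_share

    # Spend per unit of token share is proportional to each model's price.
    haiku_price = haiku_spend_share / haiku_token_share
    opus_price = opus_spend_share / opus_token_share
    price_ratio = opus_price / haiku_price

    # Normalise the Haiku price to 1 and compare the two setups.
    hierarchy_cost = haiku_token_share * 1.0 + opus_token_share * price_ratio
    all_opus_cost = 1.0 * price_ratio
    return price_ratio, all_opus_cost / hierarchy_cost


price_ratio, bill_multiplier = implied_costs(0.65, 0.36)
print(f"implied Opus/Haiku blended price ratio: {price_ratio:.2f}")
print(f"all-Opus bill vs hierarchy bill: {bill_multiplier:.2f}x")
```

Under these assumptions the shares imply a blended price ratio of about 3.3 and an all-Opus bill about 1.8 times the hierarchy bill, which is in line with the statement that the daily bill would double without the hierarchy.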

Entities

Institutions

  • Mendral
  • Hacker News
  • ClickHouse
  • GitHub CLI

Sources