Agent-breakage: A measurement framework for autonomous Kubernetes operations

other · 2026-05-25

A new paper from arXiv (2605.23058) introduces agent-breakage, a closed-loop measurement framework for autonomous Kubernetes operations agents. The authors argue that empirical claims about such agents are largely unfalsifiable due to lack of controlled baselines, selection bias, absence of pre-registered decision matrices, and small sample sizes. The framework injects faults into a target Kubernetes cluster, observes agent responses, and scores them on four axes against ground truth, accumulating outcome-labeled tuples. It distinguishes framework error from reasoning error and supports off-condition testing. The work aims to provide a verification substrate analogous to code agents' testing environments.

Key facts

arXiv paper 2605.23058
Introduces agent-breakage measurement framework
Addresses unfalsifiability of autonomous Kubernetes operations agents
Injects faults into target Kubernetes cluster
Scores agent responses on four axes against ground truth
Distinguishes framework error from reasoning error
Supports off-condition testing
Accumulates (state, action, outcome) tuples

Agent-breakage: A measurement framework for autonomous Kubernetes operations

Key facts

Entities

Institutions

Sources