Agent-breakage: A measurement framework for autonomous Kubernetes operations
A new paper from arXiv (2605.23058) introduces agent-breakage, a closed-loop measurement framework for autonomous Kubernetes operations agents. The authors argue that empirical claims about such agents are largely unfalsifiable due to lack of controlled baselines, selection bias, absence of pre-registered decision matrices, and small sample sizes. The framework injects faults into a target Kubernetes cluster, observes agent responses, and scores them on four axes against ground truth, accumulating outcome-labeled tuples. It distinguishes framework error from reasoning error and supports off-condition testing. The work aims to provide a verification substrate analogous to code agents' testing environments.
Key facts
- arXiv paper 2605.23058
- Introduces agent-breakage measurement framework
- Addresses unfalsifiability of autonomous Kubernetes operations agents
- Injects faults into target Kubernetes cluster
- Scores agent responses on four axes against ground truth
- Distinguishes framework error from reasoning error
- Supports off-condition testing
- Accumulates (state, action, outcome) tuples
Entities
Institutions
- arXiv