Structural Gap in Behavioral AI Governance Frameworks
A recent arXiv paper (2604.27292) identifies a structural problem in the governance of behavioral AI. The authors define two boundaries for any system: expressiveness (what the system is capable of doing) and governance (what its governance policies actually cover). In most operational AI systems the two boundaries are set independently, which yields three regions: governed capabilities (the only beneficial region), ungoverned capabilities (which pose risk), and governance policies for capabilities that do not exist (theater). The last two are failure modes. The paper concerns the governance of effects, that is, actions AI systems take in the real world (such as API calls, database writes, and tool invocations), as opposed to the governance of model outputs (such as content quality, bias, and fairness). The authors introduce a formal framework for analyzing this gap and invoke Rice's theorem (1953) to show that the gap is undecidable for any Turing-complete architecture that attempts to govern effects behaviorally.
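The three regions can be pictured as plain set operations on the two boundaries. The sketch below is only an illustration under simplifying assumptions: it treats both boundaries as small finite sets, and the capability names and the partition helper are hypothetical, not taken from the paper.

```python
# Hypothetical capability and policy names, chosen only for illustration.
expressible = {"send_email", "write_db", "call_payment_api", "read_file"}
governed = {"send_email", "write_db", "launch_rocket"}


def partition(expressible: set[str], governed: set[str]) -> dict[str, set[str]]:
    """Split the two boundaries into the three regions described above."""
    return {
        # Capabilities the system has and that a policy covers.
        "governed_capabilities": expressible & governed,
        # Capabilities the system has but no policy covers (risk).
        "ungoverned_capabilities": expressible - governed,
        # Policies for capabilities the system does not have (theater).
        "theater": governed - expressible,
    }


regions = partition(expressible, governed)
for name, members in regions.items():
    print(f"{name}: {sorted(members)}")
```

With finite, explicitly listed sets the partition is trivial to compute. The difficulty the paper points to arises when the expressiveness boundary is the behavior of a Turing-complete system, which cannot in general be enumerated this way.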
Key facts
- Paper arXiv:2604.27292 analyzes a structural gap in behavioral AI governance
- Two boundaries, expressiveness and governance, are defined independently in most AI systems
- Three regions result: governed capabilities, ungoverned capabilities, and theater
- Two of three regions are failure modes
- Focus is on governance of effects (actions), not model outputs
- Rice's theorem (1953) implies the gap is undecidable for Turing-complete architectures that govern effects behaviorally
- Published on arXiv
- No algorithm can close the gap in the general case
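For reference, the undecidability claim rests on Rice's theorem, which can be stated in its standard textbook form as below. The symbols are notation introduced here for illustration. Reading "effects" as part of a program's observable behavior, so that the theorem applies, is an assumption of this sketch and not necessarily the paper's own formalization.

```latex
% Rice's theorem (standard form).
% \varphi_e   : the partial computable function computed by program e
% \mathcal{P} : a set of partial computable functions (a semantic property)
% I_{\mathcal{P}} : the set of programs whose behavior has the property
\[
  I_{\mathcal{P}} = \{\, e \in \mathbb{N} : \varphi_e \in \mathcal{P} \,\},
  \qquad
  \emptyset \neq I_{\mathcal{P}} \neq \mathbb{N}
  \;\Longrightarrow\;
  I_{\mathcal{P}} \text{ is undecidable.}
\]

% Assumed reading for effect governance (hypothetical notation):
% G is the set of governed effects, and \mathrm{eff}(\varphi_e) is the set of
% effects that the behavior \varphi_e can produce on some input.
\[
  I_G = \{\, e \in \mathbb{N} : \mathrm{eff}(\varphi_e) \subseteq G \,\}
\]
% If some program stays inside G and some program does not, I_G is
% non-trivial, so by the theorem no algorithm decides membership in I_G.
```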
Entities
Institutions
- arXiv