Impossibility Theorems as Design Rules for Trustworthy AI

ai-technology · 2026-05-25

A recent thesis posted to arXiv (2605.23024) recasts classic impossibility results, those associated with Turing, Arrow, and the No Free Lunch theorems, as design rules for building trustworthy AI systems. Its central result is an accuracy ceiling set by architecture alone: beyond a certain reasoning depth, no amount of training can improve results, whatever the adapter rank, sample size, or loss function. This limit, called the Deterministic Horizon, can be computed before deployment from layer count and embedding width, and it was measured at between 19 and 31 across twelve transformer architectures. Fine-tuning on optimal-length traces recovers less than four percentage points. The thesis attributes the ceiling to a capacity invariant of the residual stream, converts it by an information-theoretic argument into super-exponential accuracy decay past the horizon, and establishes an unconditional circuit-complexity lower bound for modular reasoning.

Key facts

Thesis turns impossibility results into design rules for trustworthy AI.
Accuracy ceiling is set by architecture alone, independent of training data or compute.
Deterministic Horizon measured between 19 and 31 across 12 transformer architectures.
Fine-tuning on optimal-length traces recovers under 4 percentage points.
Mechanism is a capacity invariant of the residual stream.
Information-theoretic conversion yields super-exponential accuracy decay past the horizon.
Unconditional circuit-complexity lower bound for modular reasoning.
Horizon computable before deployment from layer count and embedding width.
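
One way to read the last point as a design rule is a pre-deployment gate: if a task needs more reasoning depth than the model's horizon, route it elsewhere instead of training harder. The Python sketch below is illustrative only and is not taken from the thesis. The summary does not give the formula that maps layer count and embedding width to the horizon, so the horizon is passed in as an already-computed number. The names check_task and route, the "fallback" label, and the choice to treat a depth exactly equal to the horizon as still acceptable are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HorizonCheck:
    """Result of comparing a task's depth with a model's horizon."""
    required_depth: int   # caller's estimate of sequential reasoning steps
    horizon: int          # Deterministic Horizon, computed elsewhere
    within_horizon: bool


def check_task(required_depth: int, horizon: int) -> HorizonCheck:
    """Compare an estimated reasoning depth with a precomputed horizon.

    Hypothetical helper: the horizon formula is not reproduced here,
    so the caller must supply the value.
    """
    if required_depth < 0:
        raise ValueError("required_depth must be non-negative")
    if horizon <= 0:
        raise ValueError("horizon must be positive")
    # Assumption: depth equal to the horizon still counts as in range.
    return HorizonCheck(required_depth, horizon, required_depth <= horizon)


def route(required_depth: int, horizon: int) -> str:
    """Return "model" when the task fits, otherwise "fallback"."""
    result = check_task(required_depth, horizon)
    return "model" if result.within_horizon else "fallback"


if __name__ == "__main__":
    # 19 and 31 are the ends of the range reported in the summary.
    for depth in (12, 19, 25, 40):
        print(depth, route(depth, horizon=19), route(depth, horizon=31))
```

The gate does no training-side work on purpose: per the summary, fine-tuning recovers under 4 percentage points, so tasks past the horizon are better handled by decomposition or a different system than by more data.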

