Generalized Turing Test Offers Formal Framework for Comparing Intelligence
A recent paper posted to arXiv introduces the Generalized Turing Test (GTT), a formal framework for comparing the capabilities of agents through indistinguishability. The relation A ≥ B holds when agent B, acting as the distinguisher, cannot tell an interaction with agent A (instructed to imitate B) apart from an interaction with another instance of B. This gives a relative measure of intelligence that does not depend on any specific dataset or task. The paper studies the structure of this comparator, including the conditions under which it is transitive and induces an ordering over equivalence classes, and introduces variants based on querying, bounded interaction, and fixed distinguishers. The framework is evaluated on contemporary models by measuring pairwise indistinguishability over thousands of trials, and the results show a stratified structure consistent with existing rankings.
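To make the comparator concrete, here is a minimal Python sketch of how A ≥ B could be estimated from repeated trials. It is an illustration under stated assumptions, not the paper's procedure: the `run_trial` callback, the 50/50 assignment of trials, and the decision rule (accuracy not significantly above chance under a normal approximation) are all hypothetical choices made for this example.

```python
import random
from math import sqrt


def distinguisher_accuracy(run_trial, n_trials, seed=0):
    """Estimate how often distinguisher B guesses correctly.

    run_trial(is_imitator) is a hypothetical callback. It runs one
    interaction in which B talks either to A imitating B
    (is_imitator=True) or to another instance of B (is_imitator=False),
    and returns B's guess as a bool (True means "this was the imitator").
    """
    rng = random.Random(seed)
    correct = 0
    for _ in range(n_trials):
        # Assumption: each trial is an imitator trial with probability 1/2.
        is_imitator = rng.random() < 0.5
        guess = run_trial(is_imitator)
        correct += int(guess == is_imitator)
    return correct / n_trials


def geq(accuracy, n_trials, z=1.645):
    """Decide A >= B in this sketch: B's accuracy is not significantly
    above chance (0.5). z=1.645 is roughly a one-sided 5% level under
    the normal approximation; the threshold is an assumption.
    """
    stderr = sqrt(0.25 / n_trials)  # standard error of a fair-coin rate
    return accuracy <= 0.5 + z * stderr


# Usage: a distinguisher that always identifies the imitator correctly.
perfect = lambda is_imitator: is_imitator
acc = distinguisher_accuracy(perfect, n_trials=1000)
print(acc, geq(acc, 1000))  # accuracy 1.0, so A >= B does not hold
```

In practice `run_trial` would wrap calls to the two models; the decision rule is the part most open to variation, and other statistical tests could be substituted for it.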
Key facts
- The Generalized Turing Test (GTT) is introduced as a formal framework for comparing agent capabilities.
- The Turing comparator A ≥ B holds if B cannot distinguish between A imitating B and another instance of B.
- The framework is dataset- and task-agnostic.
- The paper studies conditions for transitivity and ordering over equivalence classes.
- Variants include querying, bounded interaction, and fixed distinguishers.
- Empirical evaluation involves pairwise indistinguishability across thousands of trials.
- Results show a stratified structure consistent with existing rankings.
- The paper is available on arXiv under ID 2605.10851.
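The points about transitivity and ordering over equivalence classes can be illustrated with a small Python sketch. It assumes the pairwise outcomes are already available as a dictionary `geq` mapping ordered pairs of distinct agents to booleans; the agent names, the dictionary layout, and both helper functions are hypothetical and are not taken from the paper.

```python
from itertools import permutations


def is_transitive(agents, geq):
    """Check that a >= b and b >= c imply a >= c for all distinct triples."""
    return all(
        geq[(a, c)]
        for a, b, c in permutations(agents, 3)
        if geq[(a, b)] and geq[(b, c)]
    )


def equivalence_classes(agents, geq):
    """Group agents that are mutually indistinguishable
    (a >= b and b >= a). Comparing only against one representative
    per class is valid only if the relation is transitive.
    """
    classes = []
    for a in agents:
        for cls in classes:
            rep = cls[0]
            if geq[(a, rep)] and geq[(rep, a)]:
                cls.append(a)
                break
        else:
            classes.append([a])
    return classes


# Placeholder data: m1 and m2 are mutually indistinguishable,
# and both dominate m3.
agents = ["m1", "m2", "m3"]
geq = {
    ("m1", "m2"): True, ("m2", "m1"): True,
    ("m1", "m3"): True, ("m3", "m1"): False,
    ("m2", "m3"): True, ("m3", "m2"): False,
}
print(is_transitive(agents, geq))        # True
print(equivalence_classes(agents, geq))  # [['m1', 'm2'], ['m3']]
```

With measured data the relation may fail the transitivity check, in which case grouping by a single representative is no longer justified and the classes should be treated with caution.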
Entities
Institutions
- arXiv