AI Council Framework Reduces Artificial Consensus in Multi-Agent Policy Simulation
A recent study published on arXiv (2604.26561) introduces the AI Council, a three-phase framework for multi-agent policy simulation built on large language models (LLMs). The work targets artificial consensus: the tendency of evaluator agents to converge on a single option regardless of the value perspectives they are assigned.

Across 120 deliberations in two policy domains, child welfare and housing, the authors test two interventions. The first, architectural heterogeneity, assigns a different 7-9B-parameter model to each value perspective. It produced a notable drop in first-choice concentration: from 70.9% to 46.1% in the child welfare scenario (p < 0.001, r = 0.58) and from 46.0% to 22.9% in housing (p < 0.001, r = 0.50). This contrasts with accuracy-oriented multi-agent debate, where model diversity does not reduce convergence, suggesting that diversity plays a different role in settings with no single correct answer. The second intervention, coherence validation, employs a frontier model to evaluate consistency.
Key facts
- arXiv paper 2604.26561 introduces the AI Council framework
- Three-phase deliberation framework for multi-agent policy simulation
- 120 deliberations conducted across two policy scenarios
- Architectural heterogeneity reduces first-choice concentration significantly
- Child welfare scenario: first-choice concentration fell from 70.9% to 46.1%
- Housing scenario: first-choice concentration fell from 46.0% to 22.9%
- Statistical significance: p < 0.001 for both scenarios
- Effect sizes: r = 0.58 (child welfare), r = 0.50 (housing)
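The two statistics above can be illustrated with a short sketch. This is a minimal, hypothetical reading of the metrics, not the paper's code: it assumes "first-choice concentration" means the share of deliberations whose first choice is the modal option, and it computes the rank-biserial effect size r from the Mann-Whitney U statistic. The toy data are invented for illustration.

```python
# Hedged sketch of the two metrics; names and toy data are assumptions.
from collections import Counter

def first_choice_concentration(first_choices):
    """Share of deliberations whose first choice is the modal option."""
    counts = Counter(first_choices)
    return counts.most_common(1)[0][1] / len(first_choices)

def rank_biserial(group_a, group_b):
    """Rank-biserial r from the Mann-Whitney U statistic:
    r = 2*U/(n_a*n_b) - 1, where U counts pairwise wins of group_a
    (ties count 0.5). Ranges from -1 to 1; 0 means no tendency."""
    u = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u += 1
            elif a == b:
                u += 0.5
    return 2 * u / (len(group_a) * len(group_b)) - 1

# Toy example: homogeneous runs pile onto option "A";
# heterogeneous runs spread across options.
homogeneous   = ["A", "A", "A", "A", "A", "A", "A", "B", "C", "A"]
heterogeneous = ["A", "B", "C", "A", "B", "C", "D", "A", "B", "C"]

print(first_choice_concentration(homogeneous))    # 0.8
print(first_choice_concentration(heterogeneous))  # 0.3
```

On real data the groups passed to `rank_biserial` would be per-deliberation scores under the two conditions; here the point is only the formula's shape.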