ARTFEED — Contemporary Art Intelligence

Homogeneous Multi-Agent LLM Debate Fails to Filter Hallucinations

ai-technology · 2026-05-06

A recent study finds that multi-agent debate among uniform large language models (LLMs) does not reliably filter out hallucinations, contrary to a common assumption behind such methods. Researchers from an unnamed institution ran controlled experiments with teams of ten copies of a single model (Qwen2.5-7B, Llama-3.1-8B, or Ministral-3-8B) over three rounds of debate on two challenging benchmarks, GSM-Hard and MMLU-Hard. Peer debate was compared against isolated self-correction and a stochastic noise control in which agents received rationales drawn from unrelated problems. The study identifies three failure pathways: sycophantic conformity (modal adoption of up to 85.5%), contextual fragility (a vulnerability rate of up to 70.0%), and consensus collapse, in which plurality voting locks in an incorrect shared answer. Together, the results suggest that isolated self-correction often outperforms unguided homogeneous multi-agent debate, calling the effectiveness of current LLM debate methods into question.
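
To make the mechanics concrete, below is a minimal sketch of the protocol as described: N identical agents answer a question, see their peers' current answers, revise over R rounds, and a plurality vote picks the team's final answer. The query_agent function is a hypothetical stand-in for prompting one fixed model checkpoint; it is stubbed here with a crude bias toward the modal peer answer, mimicking the conformity effect the study reports, so the sketch runs end to end. It is an illustration, not the authors' code.

    import random
    from collections import Counter

    N_AGENTS = 10   # team size (N=10 in the study)
    N_ROUNDS = 3    # debate rounds (R=3 in the study)

    def query_agent(question, peer_answers=None):
        """Hypothetical stand-in for prompting one fixed model.

        When peer answers are visible, this stub leans toward the modal
        peer answer, crudely mimicking sycophantic conformity."""
        options = ["A", "B", "C"]
        if peer_answers:
            modal, _ = Counter(peer_answers).most_common(1)[0]
            return modal if random.random() < 0.8 else random.choice(options)
        return random.choice(options)

    def debate(question):
        # Round 0: each agent answers independently.
        answers = [query_agent(question) for _ in range(N_AGENTS)]
        for _ in range(N_ROUNDS):
            # Each agent revises after seeing every other agent's answer.
            answers = [
                query_agent(question, answers[:i] + answers[i + 1:])
                for i in range(N_AGENTS)
            ]
        # Plurality vote: if a shared wrong answer is modal, the team
        # converges on it ("consensus collapse").
        return Counter(answers).most_common(1)[0][0]

    print(debate("Which option is correct?"))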

Key facts

  • Multi-agent debate among homogeneous LLMs fails to filter hallucinations.
  • Study used teams of N=10 homogeneous agents, each team built from one model: Qwen2.5-7B, Llama-3.1-8B, or Ministral-3-8B.
  • Experiments conducted across R=3 debate rounds on GSM-Hard and MMLU-Hard benchmarks.
  • Compared peer debate against isolated self-correction and stochastic noise control.
  • Sycophantic conformity: modal adoption up to 85.5% (one way to compute such a rate is sketched after this list).
  • Contextual fragility: vulnerability rate up to 70.0%.
  • Consensus collapse identified as a failure pathway.
  • Isolated self-correction often outperforms unguided homogeneous multi-agent debate.
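
The conformity bullet suggests a simple metric: among agents whose answer differed from their peers' modal answer, what fraction switched to it after a round of debate? The helper below, modal_adoption_rate, is one plausible reading of that metric under stated assumptions, not the study's exact definition.

    from collections import Counter

    def modal_adoption_rate(before, after):
        """Fraction of swayable agents that adopted their peers' modal answer.

        An agent is swayable in a round if its previous answer differed
        from the modal answer among its peers. Assumed definition only."""
        switched, eligible = 0, 0
        for i, (old, new) in enumerate(zip(before, after)):
            peers = before[:i] + before[i + 1:]
            modal, _ = Counter(peers).most_common(1)[0]
            if old != modal:
                eligible += 1
                if new == modal:
                    switched += 1
        return switched / eligible if eligible else 0.0

    # Example: three dissenters all fold into the 7-agent majority on "B".
    before = ["A", "B", "B", "B", "C", "B", "B", "B", "A", "B"]
    after = ["B"] * 10
    print(modal_adoption_rate(before, after))  # 1.0, full conformity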
