Peer Identity Bias in Multi-Agent LLM Evaluation
A recent study presents the first systematic measurement of identity-dependent scoring bias in the TRUST democratic discourse analysis pipeline, whose large language model (LLM) components are exposed to peer model identity through multiple structural channels. The study covers four model families and two anonymization scopes across 30 political statements. The key finding: single-channel anonymization yields near-zero bias effects because the opposing influences of individual channels cancel one another out, which could mislead evaluators into concluding that identity bias is absent. Only full-pipeline anonymization reveals the true pattern: homogeneous ensembles amplify identity-driven sycophancy when model identity is fully visible, while the heterogeneous production configuration shows a different effect. The work underscores the need for thorough bias evaluation in multi-agent LLM systems.
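The cancellation mechanism can be sketched with hypothetical numbers (illustrative only, not the paper's data, and the channel names are invented): if two exposure channels carry identity biases of opposing sign, their sum is near zero, so a probe of the net effect under partial anonymization can look unbiased even though each channel is individually biased.

```python
# Purely illustrative: hypothetical per-channel identity biases with
# opposing signs. Channel names and magnitudes are invented, not taken
# from the study.
channel_bias = {"debate": +0.35, "judging": -0.33}

# Net bias with all channels exposed: opposing contributions cancel,
# so the aggregate looks near zero.
net_bias = sum(channel_bias.values())
print(f"net bias, all channels exposed: {net_bias:+.2f}")

# Yet the individual channels are far from unbiased.
largest = max(abs(b) for b in channel_bias.values())
print(f"largest single-channel bias:    {largest:.2f}")
```

Only removing identity information from every channel at once (full-pipeline anonymization) gives a baseline against which the per-channel contributions can be separated.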
Key facts
- First systematic measurement of identity-dependent scoring bias in the TRUST pipeline
- Four model families tested with two anonymization scopes across 30 political statements
- Single-channel anonymization produces near-zero bias effects due to cancellation
- Full-pipeline anonymization reveals true bias pattern
- Homogeneous ensembles amplify identity-driven sycophancy when identity is visible
- Heterogeneous production configuration shows different effect
- Study published on arXiv with ID 2604.22971
- Research exposes LLM components to peer model identity through multiple structural channels
Entities
Institutions
- arXiv