Bayesian Framework Disentangles LLM Opinion Biases
Researchers developed a Bayesian framework to isolate and quantify three biases in LLM opinion dynamics: topic bias, agreement bias, and anchoring bias. Applying it to multi-step dialogues across 12 questions on climate change, societal justice, and music preferences, they found that opinion trajectories converge to a shared attractor, with the influence of both interaction and biases decaying over time. The impact of the biases varied between LLMs, and fine-tuning on opinionated statements altered the dynamics.
Key facts
- Framework quantifies topic, agreement, and anchoring biases
- Tested on 12 questions covering climate change, societal justice, music preferences
- Opinion trajectories converge to shared attractor
- Interaction and bias influence decay over time
- Bias impact differs between LLMs
- Fine-tuning on opinionated statements affects dynamics
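To make "converging to a shared attractor" concrete, here is a minimal toy sketch. It is not the researchers' Bayesian framework. The function `simulate_dialogue`, the [-1, 1] opinion scale, the linear update rule, and all parameter values are assumptions chosen only for illustration. The sketch also simplifies by decaying only the interaction weight and keeping the pull toward the attractor constant.

```python
def simulate_dialogue(x_a, x_b, attractor=0.2, steps=20,
                      pull=0.3, interaction0=0.4, decay=0.7):
    """Toy two-agent opinion update on a [-1, 1] scale (hypothetical model).

    x_a, x_b      : initial opinions of the two agents
    attractor     : assumed shared opinion both agents drift toward
    pull          : assumed constant strength of that drift
    interaction0  : assumed initial weight on the other agent's opinion
    decay         : per-turn factor by which the interaction weight shrinks
    """
    trajectory = [(x_a, x_b)]
    for t in range(steps):
        # Interaction weight shrinks geometrically with each dialogue turn.
        w = interaction0 * decay ** t
        # Each agent moves toward the attractor and, with weight w,
        # toward the other agent's opinion from the previous turn.
        new_a = x_a + pull * (attractor - x_a) + w * (x_b - x_a)
        new_b = x_b + pull * (attractor - x_b) + w * (x_a - x_b)
        x_a, x_b = new_a, new_b
        trajectory.append((x_a, x_b))
    return trajectory


if __name__ == "__main__":
    # Two agents start with opposing opinions.
    for turn, (a, b) in enumerate(simulate_dialogue(-0.9, 0.8)):
        print(f"turn {turn:2d}: agent A = {a:+.3f}, agent B = {b:+.3f}")
```

In this toy setting both agents end up near the assumed attractor regardless of where they start, and the other agent's opinion matters less with every turn. Changing `attractor` is a loose stand-in for the reported effect of fine-tuning on opinionated statements.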
Entities
Institutions
- arXiv