Frontier LLMs Converge on Uniform Assistant Personalities
A large-scale experiment scores the personalities of frontier LLMs on 144 traits using an external Elo-based rating. All tested models converge on expressing systematic, methodical, and analytical traits while suppressing remorseful and sycophantic ones. Models diverge more on middle-of-distribution traits such as poetic or playful, but even the more creative models maintain neutral identities. This uniformity suggests that a standard for optimal assistant behavior has emerged implicitly, pointing to a tacit consensus among model developers despite varied training methods.
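To make the Elo-based trait scoring concrete, here is a minimal sketch of how pairwise judgments could be turned into per-trait ratings. The summary does not describe the study's actual procedure, so everything here is an assumption: the standard Elo update, the K-factor of 32, the starting rating of 1000, the idea that an external judge decides which of two traits a response expresses more strongly, and the function names `expected_score`, `update_elo`, and `rate_traits`. The sample comparisons are illustrative only and are not data from the study.

```python
from collections import defaultdict


def expected_score(r_a: float, r_b: float) -> float:
    """Standard Elo expectation that A beats B, given their ratings."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def update_elo(ratings, winner: str, loser: str, k: float = 32.0) -> None:
    """Apply one zero-sum Elo update in place: the winner gains what the loser loses."""
    delta = k * (1.0 - expected_score(ratings[winner], ratings[loser]))
    ratings[winner] += delta
    ratings[loser] -= delta


def rate_traits(comparisons, initial: float = 1000.0, k: float = 32.0) -> dict:
    """Fold a sequence of (more_expressed, less_expressed) trait pairs into ratings.

    Each pair is assumed to come from an external judge (hypothetical here)
    that decided which trait a model response expressed more strongly.
    """
    ratings = defaultdict(lambda: initial)
    for winner, loser in comparisons:
        update_elo(ratings, winner, loser, k)
    return dict(ratings)


# Illustrative judgments only, not results from the study.
comparisons = [
    ("analytical", "sycophantic"),
    ("methodical", "remorseful"),
    ("analytical", "playful"),
    ("playful", "sycophantic"),
]
ratings = rate_traits(comparisons)

# Print traits from most to least expressed under this toy data.
for trait, score in sorted(ratings.items(), key=lambda kv: -kv[1]):
    print(f"{trait:12s} {score:7.1f}")
```

Under a scheme like this, a trait that consistently wins comparisons drifts to the top of the distribution and one that consistently loses drifts to the bottom, which is one way the contrast between converged extremes and divergent middle-of-distribution traits could be read off per model.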
Key facts
- Large-scale experiment on frontier LLM personalities using external Elo-based trait scoring across 144 traits.
- All models tested converge on systematic, methodical, and analytical trait expression.
- Models suppress traits such as remorseful and sycophantic.
- Models diverge more in middle-of-distribution traits like poetic or playful.
- Even creative models tend to have more neutral identities.
- Similarities suggest implicit emergence of a standard of optimal assistant behavior.
- The results of character training stand out for their uniformity despite varied training methods.
- The study offers insight into a tacit consensus among model developers.
Entities
Institutions
- arXiv