LLM Agents Simulate Human Behavior from Self-Reports
A recent study investigates whether large language models (LLMs) can power general-purpose simulations of individuals, known as generative agents, built from self-reported data. The researchers recruited a diverse national sample of 1,052 Americans and constructed agents from two-hour semi-structured interviews (American Voices Project), structured surveys (General Social Survey and Big Five personality inventory), or a combination of both. On held-out General Social Survey items, agent accuracy, benchmarked against participants' own two-week test-retest consistency, was 83% (interview only), 82% (surveys only), and 86% (combined), outperforming agents prompted with demographic data alone. These results suggest that LLMs can forecast human behavior across domains without requiring a purpose-built structured dataset for each outcome, offering a more versatile approach to behavioral simulation.
Key facts
- Study uses LLMs to create person-specific simulations (generative agents) from self-report data.
- Sample includes 1,052 Americans from a diverse national sample.
- Data sources: semi-structured interviews (American Voices Project), structured surveys (General Social Survey, Big Five personality inventory), or both.
- Agent accuracy on held-out GSS items: 83% (interview only), 82% (surveys only), 86% (combined).
- Accuracy measured against participants' two-week test-retest consistency.
- Outperforms agents prompted only with demographics.
- Published on arXiv: 2411.10109v2.
- Enables general-purpose behavioral prediction without domain-specific structured data.
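The benchmark described above, agent accuracy measured against each participant's own two-week test-retest consistency, can be sketched in a few lines. This is a minimal illustration with made-up response vectors, not the study's actual evaluation code; the variable names and item codings are hypothetical.

```python
from typing import List

def agreement(a: List[int], b: List[int]) -> float:
    """Fraction of items on which two response vectors give the same answer."""
    assert len(a) == len(b) and len(a) > 0
    return sum(x == y for x, y in zip(a, b)) / len(a)

# Hypothetical categorical responses to five held-out GSS-style items.
participant_t1 = [1, 0, 2, 1, 3]  # participant, first sitting
participant_t2 = [1, 0, 2, 0, 3]  # same participant, two weeks later
agent_answers  = [1, 0, 1, 1, 3]  # generative agent's predicted answers

raw_accuracy     = agreement(agent_answers, participant_t1)
self_consistency = agreement(participant_t2, participant_t1)

# Normalized accuracy: how well the agent matches the participant,
# relative to how consistently the participant matches themself.
normalized = raw_accuracy / self_consistency
print(raw_accuracy, self_consistency, normalized)  # → 0.8 0.8 1.0
```

Normalizing this way caps the achievable score at human self-consistency: an agent cannot reasonably be expected to predict a participant's answers more reliably than the participant reproduces them.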
Entities
Projects, instruments, and repositories
- American Voices Project
- General Social Survey
- Big Five personality inventory
- arXiv