LLMs Predict Missing Public Opinion in Historical Surveys
Researchers have developed a method that uses large language models (LLMs) to predict missing responses in repeated cross-sectional surveys. The framework targets two kinds of gaps: retrodiction, which estimates opinions for years in which a question was not asked, and unasked opinion prediction, which estimates responses to questions that were never asked at all. It combines embeddings for survey questions, respondents, and survey years, and was tested on the General Social Surveys (GSS) from 1972 to 2021. The models performed strongly in cross-validation and could estimate public opinion measured by other organizations for years when the GSS did not ask the relevant questions. The approach helps recover missing trends and identify when public attitudes changed, such as growing support for specific issues, even though surveys do not ask every question every year.
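To make the idea of combining question, respondent, and year embeddings concrete, here is a minimal sketch. It is not the researchers' implementation. It assumes a simple logistic scoring function, uses random vectors as stand-ins for LLM-derived question embeddings, and trains on synthetic random answers, so its outputs carry no substantive meaning. All names (`question_emb`, `respondent_emb`, `year_emb`, `predict_proba`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM, N_RESP, N_QUES, N_YEARS = 8, 60, 6, 5

# Assumption: random vectors stand in for LLM-derived question embeddings.
# They are held fixed; only respondent and year embeddings are learned.
question_emb = rng.normal(size=(N_QUES, DIM))
respondent_emb = np.zeros((N_RESP, DIM))
year_emb = np.zeros((N_YEARS, DIM))

# Repeated cross-section: each respondent is interviewed in exactly one year.
resp_year = rng.integers(0, N_YEARS, size=N_RESP)

# Synthetic observations: every respondent answers every question,
# except that question 0 is "not asked" in year 0.
r_all, q_all = np.meshgrid(np.arange(N_RESP), np.arange(N_QUES), indexing="ij")
r_all, q_all = r_all.ravel(), q_all.ravel()
asked = ~((q_all == 0) & (resp_year[r_all] == 0))
r_obs, q_obs = r_all[asked], q_all[asked]
t_obs = resp_year[r_obs]
y_obs = rng.integers(0, 2, size=r_obs.size).astype(float)  # random 0/1 answers


def predict_proba(r, q, t):
    """P(agree) from a dot product of the question embedding with the
    sum of the respondent and year embeddings (an assumed scoring rule)."""
    logit = np.sum(question_emb[q] * (respondent_emb[r] + year_emb[t]), axis=-1)
    return 1.0 / (1.0 + np.exp(-logit))


def bce_loss():
    """Mean binary cross-entropy over the observed answers."""
    p = predict_proba(r_obs, q_obs, t_obs)
    return float(-np.mean(y_obs * np.log(p) + (1 - y_obs) * np.log(1 - p)))


initial_loss = bce_loss()
LEARNING_RATE = 0.05
for _ in range(200):
    # Full-batch gradient descent on the respondent and year embeddings.
    p = predict_proba(r_obs, q_obs, t_obs)
    grad = ((p - y_obs) / y_obs.size)[:, None] * question_emb[q_obs]
    grad_resp = np.zeros_like(respondent_emb)
    grad_year = np.zeros_like(year_emb)
    np.add.at(grad_resp, r_obs, grad)  # accumulate per-respondent gradients
    np.add.at(grad_year, t_obs, grad)  # accumulate per-year gradients
    respondent_emb -= LEARNING_RATE * grad_resp
    year_emb -= LEARNING_RATE * grad_year
final_loss = bce_loss()

# Retrodiction: score question 0 for year-0 respondents, who never saw it.
missing_r = np.flatnonzero(resp_year == 0)
retro = predict_proba(missing_r, 0, 0)
print(f"loss {initial_loss:.4f} -> {final_loss:.4f}; "
      f"retrodicted {retro.size} missing answers")
```

The retrodiction step works only because the year-0 respondents and the year-0 embedding were trained on other questions, while question 0 has an embedding that does not depend on having been asked that year. Under the same assumption, a question that was never asked in any year could be scored by supplying an embedding of its text.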
Key facts
- LLM-based framework predicts missing survey responses
- Two applications: retrodiction and unasked opinion prediction
- Tested on 1972-2021 General Social Surveys data
- Models perform strongly in cross-validation
- Can estimate opinions measured by other organizations for years when the GSS did not ask the question
- Enables recovery of missing trends
- Identifies timing of attitude changes
- Published on arXiv: 2305.09620
Entities
Institutions
- arXiv
- General Social Surveys