Language Models Show Human-Like Plausibility Judgments
A new study on arXiv (2507.12553) reveals that language models (LMs) can reliably sort sentences by modal category, i.e. whether a described event is possible, impossible, or nonsensical. The researchers identify linear representations within LMs, termed modal difference vectors, that discriminate between these categories. These vectors emerge in a consistent order as models become more competent, across training steps, layers, and parameter counts. The findings challenge earlier work that cast doubt on LMs' modal reasoning abilities (Michaelov et al., 2025; Kauf et al., 2023).
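A common way to extract a linear "difference vector" of this kind is to take the difference of mean hidden-state activations between two categories and classify held-out examples by the sign of their projection onto it. The sketch below illustrates that idea on synthetic activations; it is not the paper's implementation, and the dimensionality, separation strength, and category names are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 64  # hypothetical hidden-state dimension (assumption, not from the paper)

# Synthetic stand-ins for one layer's hidden states over sentences of two
# modal categories ("possible" vs "impossible"); in the study these would
# come from an actual LM's activations.
axis = rng.normal(size=d)
axis /= np.linalg.norm(axis)
possible = rng.normal(size=(100, d)) + 2.0 * axis
impossible = rng.normal(size=(100, d)) - 2.0 * axis

# Difference-in-means vector: subtract the class-mean activations,
# then normalize to unit length.
diff_vec = possible.mean(axis=0) - impossible.mean(axis=0)
diff_vec /= np.linalg.norm(diff_vec)

def classify(h):
    # Sign of the projection onto the difference vector picks the category.
    return "possible" if h @ diff_vec > 0 else "impossible"

# Evaluate on held-out synthetic activations.
held_out = rng.normal(size=(50, d)) + 2.0 * axis
acc = np.mean([classify(h) == "possible" for h in held_out])
print(f"held-out accuracy: {acc:.2f}")
```

Because the two synthetic clusters are linearly separated along a single direction, the projection recovers the category well above chance, which is the behavior the paper attributes to modal difference vectors inside real models.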
Key facts
- Study identifies modal difference vectors in LMs
- LMs can reliably categorize sentences by modal category
- Modal difference vectors emerge consistently with model competence
- Contradicts earlier studies by Michaelov et al. and Kauf et al.
- Published on arXiv under ID 2507.12553
- Research focuses on event plausibility judgments
- Vectors appear through training steps, layers, and parameter count
- Linear representations discriminate between modal categories
Entities
Institutions
- arXiv