LLMs Fail to Reproduce Human Realization Effect in Risk-Taking

ai-technology · 2026-05-26

A recent arXiv preprint (2605.25151) investigates whether large language models (LLMs) exhibit the realization effect, a finding from behavioral economics that risk-taking differs depending on whether prior gains and losses are realized or remain on paper. The researchers evaluated LLM behavior at three levels: sensitivity to prompts alone, linear decoding of internal representations, and causal manipulation via activation steering. Prompt-only analysis showed consistent sensitivity to the experimental conditions, but the direction of the effect did not match the human pattern. A realization-status signal could be linearly decoded from layer 18 of Gemma's residual stream, and it generalized to held-out prompts. However, steering along this signal did not reliably change downstream risk decisions, which suggests the model represents realization status without using it to drive risk choices the way humans do.
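
The decoding and steering steps described above can be illustrated with a small sketch. This is a toy example under stated assumptions, not the paper's method: the activations are synthetic Gaussian vectors rather than Gemma's layer-18 residual stream, the probe is a simple difference-of-means classifier (the summary does not say which linear probe the authors used), and every function name and constant here is hypothetical.

```python
import random

def mean_vector(rows):
    """Column-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def make_synthetic_activations(n, dim, signal, label, rng):
    """ASSUMPTION: toy stand-in for residual-stream activations.
    Gaussian noise plus a label-dependent shift along axis 0."""
    sign = 1.0 if label == 1 else -1.0
    return [
        [rng.gauss(0, 1) + (sign * signal if i == 0 else 0.0) for i in range(dim)]
        for _ in range(n)
    ]

def fit_probe(realized, paper):
    """Difference-of-means linear probe: returns (direction, threshold)."""
    mu_r = mean_vector(realized)
    mu_p = mean_vector(paper)
    direction = [a - b for a, b in zip(mu_r, mu_p)]
    midpoint = [(a + b) / 2 for a, b in zip(mu_r, mu_p)]
    return direction, dot(direction, midpoint)

def predict(direction, threshold, x):
    """1 = 'realized', 0 = 'paper', according to the linear readout."""
    return 1 if dot(direction, x) > threshold else 0

def steer(x, direction, alpha):
    """Activation steering: add alpha times the unit probe direction."""
    norm = dot(direction, direction) ** 0.5
    return [xi + alpha * di / norm for xi, di in zip(x, direction)]

rng = random.Random(0)
DIM, SIGNAL = 16, 2.0  # hypothetical sizes chosen for illustration

train_r = make_synthetic_activations(200, DIM, SIGNAL, 1, rng)
train_p = make_synthetic_activations(200, DIM, SIGNAL, 0, rng)
test_r = make_synthetic_activations(100, DIM, SIGNAL, 1, rng)
test_p = make_synthetic_activations(100, DIM, SIGNAL, 0, rng)

direction, threshold = fit_probe(train_r, train_p)

# Held-out accuracy of the linear readout.
correct = sum(predict(direction, threshold, x) == 1 for x in test_r)
correct += sum(predict(direction, threshold, x) == 0 for x in test_p)
accuracy = correct / (len(test_r) + len(test_p))

# Push one 'paper' activation along the probe direction.
steered = steer(test_p[0], direction, alpha=10.0)
steered_label = predict(direction, threshold, steered)
```

In this toy setup the probe readout flips after steering by construction, because the same direction is used to read and to write. The study's null result concerns something the sketch does not model: whether the model's downstream risk choices change once the activation has been shifted.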

Key facts

Study tests realization effect in LLMs
Three evaluation levels: prompt-only, linear readout, activation steering
Prompt-only results show condition sensitivity, but in a direction that does not match human behavior
Gemma's residual stream has realization-status signal at layer 18
Signal generalizes to held-out prompts
Activation steering does not reliably shift risk choices
Null result holds across conditions
Paper on arXiv: 2605.25151
