ARTFEED — Contemporary Art Intelligence

LLMs' Internal Reasoning as a Polylogue of Persona Vectors

other · 2026-05-12

A recent paper on arXiv (2605.09159) suggests that large language models (LLMs) encode behavioural traits as "persona vectors": linear directions in activation space. These vectors can be tracked during generation as a "polylogue", the time series of alignments between each persona vector and the model's hidden activations. Experiments on four open-weight models indicate that polylogue features predict correctness on MMLU-Pro competitively with low-dimensional activation baselines, while remaining interpretable. The polylogue also yields concrete steering targets for adjusting latent directions at different stages of a response, implemented as a paragraph-conditioned intervention that improves accuracy.
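The core quantity, as described above, is a time series of alignments between fixed persona directions and per-token hidden states. A minimal sketch of that computation follows; the paper's actual extraction procedure is not given here, so the cosine-similarity form, the random stand-in activations, and all variable names are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
d, T = 64, 10        # hidden size and number of generated tokens (arbitrary)
n_personas = 3

# Persona vectors: unit-norm linear directions in activation space (assumed).
personas = rng.normal(size=(n_personas, d))
personas /= np.linalg.norm(personas, axis=1, keepdims=True)

# Stand-in for the model's hidden activation at each generated token.
hidden = rng.normal(size=(T, d))

# Polylogue: for every token, the alignment (cosine similarity assumed)
# between the hidden state and each persona vector.
polylogue = (hidden @ personas.T) / np.linalg.norm(hidden, axis=1, keepdims=True)
print(polylogue.shape)  # (T, n_personas): one alignment trace per persona
```

Each column of `polylogue` is one persona's trace over the response; features summarising these traces are what the paper reportedly feeds into the MMLU-Pro correctness predictor.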

Key facts

  • arXiv paper 2605.09159
  • LLMs encode behavioural traits as persona vectors
  • Persona vectors are linear directions in activation space
  • Polylogue is the time series of alignments between persona vectors and hidden activations
  • Experiments on four open-weight models
  • Polylogue features predict correctness on MMLU-Pro
  • Competitive with low-dimensional activation baselines
  • Paragraph-conditioned intervention improves accuracy
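The last fact, a paragraph-conditioned intervention, can be sketched as adding a scaled persona direction to the hidden state only for tokens inside a chosen paragraph. This additive form, the strength parameter, and the token mask are assumptions for illustration, not the paper's exact procedure.

```python
import numpy as np

rng = np.random.default_rng(1)
d, T = 64, 12
v = rng.normal(size=d)
v /= np.linalg.norm(v)          # unit-norm steering direction (assumed)
hidden = rng.normal(size=(T, d))

alpha = 2.0                     # steering strength (assumed hyperparameter)
para_mask = np.zeros(T, dtype=bool)
para_mask[4:9] = True           # tokens belonging to the target paragraph

# Shift activations along the persona direction only inside the paragraph.
steered = hidden + alpha * para_mask[:, None] * v
```

Tokens outside the mask are untouched, while masked tokens have their projection onto `v` increased by exactly `alpha`, which is the sense in which the intervention is "paragraph-conditioned".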

Entities

Institutions

  • arXiv
