ARTFEED — Contemporary Art Intelligence

Shared Confidence Features Across Languages in LLMs

ai-technology · 2026-06-01

A recent study published on arXiv (2605.31220) examines whether multilingual large language models encode confidence features that transfer across languages. The researchers train a simple linear probe on English data only and find that it accurately predicts answer correctness zero-shot across a range of typologically distinct languages, with no supervision in the target language. Confidence features are concentrated in the model's middle layers, which suggests a shared confidence subspace. Performance varies with the target language's similarity to the source language, but the approach avoids retraining a probe for each language. This addresses a significant gap in confidence estimation research, which has focused mainly on English despite the multilingual use of LLMs.
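To make the probing idea concrete, here is a minimal sketch of an English-only linear probe evaluated zero-shot on another language. It is not the study's code: the activations are synthetic stand-ins for middle-layer hidden states, and every name in it (make_hidden_states, train_linear_probe, probe_accuracy, confidence_dir, lang_shift) is hypothetical. The sketch assumes that correctness is encoded along one direction shared by all languages and that each language adds its own fixed offset to the activations.

```python
import numpy as np

def make_hidden_states(n, dim, confidence_dir, lang_shift, rng):
    """Synthetic stand-in for middle-layer activations of one language (assumption)."""
    correct = rng.integers(0, 2, size=n)  # 1 = the model's answer was correct
    noise = rng.normal(size=(n, dim))
    # Push activations along the shared direction: +2 if correct, -2 if wrong.
    signal = 2.0 * np.outer(2.0 * correct - 1.0, confidence_dir)
    return noise + signal + lang_shift, correct

def train_linear_probe(x, y, lr=0.1, steps=500):
    """Logistic regression fitted with plain gradient descent."""
    w = np.zeros(x.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(x @ w + b)))  # predicted P(correct)
        grad = p - y
        w -= lr * (x.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b

def probe_accuracy(w, b, x, y):
    """Fraction of examples whose correctness the probe predicts."""
    return float((((x @ w + b) > 0).astype(int) == y).mean())

rng = np.random.default_rng(0)
dim = 32

# Shared "confidence" direction (unit length), assumed common to all languages.
confidence_dir = rng.normal(size=dim)
confidence_dir /= np.linalg.norm(confidence_dir)

# English has no offset; the target language gets a small random offset.
target_shift = rng.normal(size=dim)
target_shift *= 0.5 / np.linalg.norm(target_shift)

x_en, y_en = make_hidden_states(2000, dim, confidence_dir, np.zeros(dim), rng)
x_tgt, y_tgt = make_hidden_states(2000, dim, confidence_dir, target_shift, rng)

# Train on English only, then apply the same probe to the target language.
w, b = train_linear_probe(x_en, y_en)
acc_en = probe_accuracy(w, b, x_en, y_en)
acc_target = probe_accuracy(w, b, x_tgt, y_tgt)
print(f"English accuracy: {acc_en:.3f}, zero-shot target accuracy: {acc_target:.3f}")
```

In this toy setup, increasing the norm of target_shift is a crude way to mimic a language that is less similar to English: the larger the offset, the more the English decision boundary can be misplaced for the target language.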

Key facts

  • Study from arXiv 2605.31220
  • Focuses on zero-shot cross-lingual confidence estimation
  • Uses a lightweight linear probe trained on English
  • Probe generalizes to unseen languages without target-language supervision
  • Confidence features concentrate in middle layers across languages
  • Performance depends on similarity to source language
  • Addresses lack of multilingual confidence estimation research

Entities

Institutions

  • arXiv
