ARTFEED — Contemporary Art Intelligence

Geopolitical bias in LLMs originates in post-training, not pre-training

ai-technology · 2026-05-25

A recent study published on arXiv (2605.23825) examined seven pairs of open-weight LLMs from seven different laboratories, each pair consisting of a base model (pre-training only) and a chat model (pre-training plus post-training). Using a paired-scenario forced-choice probe, the analysis covered 28 country pairs in English, French, and Chinese. The findings indicate that geopolitical bias arises during post-training, not pre-training. For six of the seven labs, post-training shifted model preferences toward the developer's country or region. Alibaba's Qwen 2.5 exhibited the largest change: the base model was statistically neutral on China-favorability (-0.15 log-odds, p=0.15), while the chat version rose to +2.91 (p<10^-4), which corresponds to odds of roughly 18 to 1 in China's favor. Other models also displayed bias shifts that depended on the prompt's language. Together, the results challenge the belief that bias is solely a result of pre-training data.
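To make the log-odds figures concrete, here is a minimal sketch of the arithmetic, assuming the reported values are natural-log odds. The function and variable names are illustrative and do not come from the study.

```python
import math

def odds_from_log_odds(log_odds: float) -> float:
    """Convert natural-log odds to plain odds (e.g. 2.0 means 2 to 1)."""
    return math.exp(log_odds)

def prob_from_log_odds(log_odds: float) -> float:
    """Convert natural-log odds to a probability via the logistic function."""
    return 1.0 / (1.0 + math.exp(-log_odds))

# Values reported for Qwen 2.5 in the article (assumed to be natural-log odds).
base_log_odds = -0.15   # base model, not significantly different from 0
chat_log_odds = 2.91    # chat model

# Odds of the chat model favoring China, relative to even odds.
chat_odds = odds_from_log_odds(chat_log_odds)

# Odds ratio between the chat model and the base model point estimate.
shift_ratio = odds_from_log_odds(chat_log_odds - base_log_odds)

# Implied probability of a China-favoring choice for the chat model.
chat_prob = prob_from_log_odds(chat_log_odds)

print(f"chat odds vs. even:  {chat_odds:.1f}")
print(f"chat vs. base ratio: {shift_ratio:.1f}")
print(f"chat probability:    {chat_prob:.3f}")
```

Under this reading, the roughly 18-fold figure matches the chat model's odds relative to a neutral baseline of zero log-odds. Measured against the base model's point estimate of -0.15, the ratio comes out slightly higher, at about 21.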

Key facts

  • Geopolitical bias in LLMs originates in post-training, not pre-training.
  • Seven open-weight base/chat LLM pairs from seven labs were tested.
  • Probe used 28 country pairs in English, French, and Chinese.
  • Models from six of seven labs shifted toward the developer's country or region after post-training.
  • Alibaba's Qwen 2.5 showed the strongest shift: from -0.15 to +2.91 log-odds.
  • Shift magnitude depends on the language of the prompt.
  • Study published on arXiv with ID 2605.23825.

Entities

Institutions

  • Alibaba
  • Qwen

Sources