ARTFEED — Contemporary Art Intelligence

New RL Method Reduces Covert Political Bias in LLMs

ai-technology · 2026-05-23

A team of researchers has developed a technique for reducing covert political bias in large language models (LLMs). They find that LLMs display systematic political bias across sensitive contexts, handling counterpart topics from opposing political sides asymmetrically, a pattern they call covert political bias. They identify seven categories of techniques through which this bias operates. To quantify it, they propose two metrics: Sentiment Consistency, which measures the symmetry of rhetoric and framing across paired political prompts, and Helpfulness Consistency, which measures whether paired responses show symmetric depth and engagement. To address both kinds of asymmetry, they created Political Consistency Training (PCT), a reinforcement learning (RL) method with two complementary components: Sentiment Consistency Training and Helpfulness Consistency Training. According to the reported results, PCT preserves overall helpfulness while significantly reducing covert political bias, and it generalizes to held-out benchmarks. The work is available on arXiv.
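To make the idea of a paired-prompt consistency metric concrete, here is a minimal sketch in Python. It is not the paper's formula, which this summary does not give. It assumes each response to a pair of counterpart prompts has already been scored in [0, 1] by some sentiment or helpfulness scorer, and it treats consistency as one minus the mean absolute gap between the two sides. The function name `consistency` and all scores below are hypothetical.

```python
from statistics import mean


def consistency(pairs):
    """Return a symmetry score in [0, 1] for paired scores.

    Each pair holds scores in [0, 1] for the responses to two counterpart
    prompts from opposing political sides. A value of 1.0 means both sides
    were scored identically on every pair.

    Assumption: "1 - mean absolute gap" is an illustrative choice,
    not the definition used in the paper.
    """
    if not pairs:
        raise ValueError("need at least one pair of scores")
    return 1.0 - mean(abs(a - b) for a, b in pairs)


# Hypothetical sentiment scores for three prompt pairs (side A, side B).
sentiment_pairs = [(0.8, 0.4), (0.6, 0.6), (0.7, 0.5)]

# Hypothetical helpfulness scores (depth and engagement) for two pairs.
helpfulness_pairs = [(0.9, 0.9), (0.5, 1.0)]

sentiment_consistency = consistency(sentiment_pairs)
helpfulness_consistency = consistency(helpfulness_pairs)

print(f"Sentiment Consistency (sketch):   {sentiment_consistency:.2f}")
print(f"Helpfulness Consistency (sketch): {helpfulness_consistency:.2f}")
```

Under this assumed definition, a model that praises one side's position while hedging on the mirror-image position gets a lower sentiment score, even if each response looks reasonable on its own.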

Key facts

  • LLMs exhibit systematic political bias across sensitive contexts
  • Covert political bias refers to asymmetric handling of counterpart topics from opposing political sides
  • 7 categories of techniques for covert political bias identified
  • Sentiment Consistency metric measures symmetry in rhetoric and framing
  • Helpfulness Consistency metric measures symmetric depth and engagement
  • Political Consistency Training (PCT) is an RL training method
  • PCT includes Sentiment Consistency Training and Helpfulness Consistency Training
  • PCT preserves overall helpfulness and reduces covert political bias
  • PCT generalizes to held-out benchmarks
  • Work released on arXiv

Entities

Institutions

  • arXiv
