ARTFEED — Contemporary Art Intelligence

Red-Teaming LLMs for Political Influence Campaigns

ai-technology · 2026-05-25

A recent study posted to arXiv presents a red-teaming framework for evaluating how large language models (LLMs) could be misused in political influence campaigns. The research focuses on locally deployed open-source LLMs, which are more attractive than API-only models to privacy-conscious malicious users. The framework measures LLM Overton Windows (OWs), the range of political views a model can consistently articulate on contentious issues, and quantifies how much simple natural-language jailbreaks widen that range. Across more than 30 LLMs from 10 model families and five countries of origin, the study finds systematic biases in political expressivity: open-source LLMs tend to generate more left-leaning social media content. The stated goal is to strengthen information integrity by identifying vulnerabilities before they can be exploited.
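To make the idea of a window and its widening concrete, here is a minimal sketch. It is not the study's method. It assumes each generated text has already been assigned a stance score on a hypothetical scale from -1.0 (left) to 1.0 (right), treats the window as the simple min-to-max range of those scores, and ignores the consistency requirement described above. The names OvertonWindow, overton_window, and expansion, and all the numbers, are invented for illustration and are not results from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OvertonWindow:
    """Hypothetical container: the span of stances a model produced."""
    low: float   # most left-leaning stance score observed
    high: float  # most right-leaning stance score observed

    @property
    def width(self) -> float:
        return self.high - self.low


def overton_window(stance_scores: list[float]) -> OvertonWindow:
    """Assumed definition: the window is the min-to-max range of the scores."""
    if not stance_scores:
        raise ValueError("need at least one stance score")
    return OvertonWindow(min(stance_scores), max(stance_scores))


def expansion(baseline: OvertonWindow, jailbroken: OvertonWindow) -> float:
    """How much wider the window is with the jailbreak prompt than without."""
    return jailbroken.width - baseline.width


# Made-up scores for one model on one issue, without and with a jailbreak prompt.
baseline = overton_window([-0.75, -0.5, 0.0, 0.25])
jailbroken = overton_window([-1.0, -0.5, 0.5, 0.75])

print(baseline.width, jailbroken.width, expansion(baseline, jailbroken))
```

A positive expansion value would mean the jailbreak prompt let the model voice positions outside its default range. A baseline window whose midpoint sits below zero would correspond, under this assumed scale, to the left-leaning default the study reports.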

Key facts

  • The study introduces a red-teaming framework for LLMs.
  • It focuses on locally deployed open-source LLMs.
  • The framework measures LLM Overton Windows (OWs).
  • An OW is the range of political opinions a model can consistently express on contentious issues.
  • Simple natural-language jailbreaks expand the OW range.
  • Over 30 LLMs from 10 model families were evaluated.
  • Models from five countries of origin were tested.
  • Open-source LLMs show systematic left-leaning bias in political expressivity.

Entities

Institutions

  • arXiv
