LLM Vulnerability Detection Exploited via Contextual Bias Injection

ai-technology · 2026-04-25

A study on arXiv (2603.18740) reveals that Large Language Models (LLMs) used in Automated Code Review (ACR) are susceptible to the framing effect, where the presentation of information overrides semantic content in forming judgments. Researchers found that adversaries can exploit this through contextual-bias injection, crafting pull request (PR) metadata to bias security judgments as a supply-chain attack vector. The study tested 6 LLMs under five framing conditions, finding bug-free framing produced the strongest effect. This poses a risk to real-world ACR pipelines integrating LLMs as interactive assistants or autonomous agents in CI/CD workflows.

Key facts

arXiv paper 2603.18740 studies LLM vulnerability detection in ACR
Framing effect influences LLM judgments in code review
Contextual-bias injection is a supply-chain attack vector
6 LLMs tested under five framing conditions
Bug-free framing produced the strongest effect
Attack targets PR metadata in CI/CD pipelines
LLMs used as interactive assistants or autonomous agents
Study is large-scale and exploratory

LLM Vulnerability Detection Exploited via Contextual Bias Injection

Key facts

Entities

Institutions

Sources