LLMs Show Positional Bias in Semantic Sensitivity Testing Framework
A scalable experimental framework has been developed to systematically probe how sensitive LLMs are to minor semantic variations in document-comparison tasks, framed as a needle-in-a-haystack problem. Researchers embedded single semantically altered sentences into a broader context across tens of thousands of document pairs and evaluated five LLMs, varying the perturbation type (negations, conjunction swaps, and named-entity replacements), the surrounding context (original content versus topically unrelated material), the position of the needle, and the document length. The findings indicate a within-document positional bias, distinct from previously observed candidate-order effects, with most models penalizing earlier semantic differences more harshly. Placing altered sentences in topically unrelated context systematically lowered similarity scores. The framework is described in arXiv preprint 2604.18835v1, announced as a cross-listed submission, and offers a multifaceted view of how LLMs handle subtle semantic changes in text comparison.
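The paper's own implementation is not reproduced here; the following Python sketch, using hypothetical helper names such as `build_pair` and a toy `negate` function, illustrates how a single perturbed "needle" sentence might be inserted at a controlled position within a document pair of this kind:

```python
from dataclasses import dataclass

@dataclass
class DocumentPair:
    original: str       # unmodified document
    perturbed: str      # same document with one altered "needle" sentence
    needle_index: int   # sentence position of the needle
    perturbation: str   # "negation", "conjunction_swap", or "entity_swap"

def negate(sentence: str) -> str:
    """Toy negation; a real pipeline would use a parser or an LLM rewrite."""
    if " is " in sentence:
        return sentence.replace(" is ", " is not ", 1)
    return "It is not the case that " + sentence.lower()

def build_pair(sentences: list[str], needle_index: int,
               perturbation: str = "negation") -> DocumentPair:
    """Embed a single semantically altered sentence at a controlled position."""
    altered = list(sentences)
    if perturbation == "negation":
        altered[needle_index] = negate(sentences[needle_index])
    # conjunction swaps and named-entity replacements would be handled analogously
    return DocumentPair(
        original=" ".join(sentences),
        perturbed=" ".join(altered),
        needle_index=needle_index,
        perturbation=perturbation,
    )

# Vary needle position (early, middle, late) to probe positional effects.
sentences = [f"Fact number {i} is reported here." for i in range(20)]
pairs = [build_pair(sentences, i) for i in (0, 10, 19)]
```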
Key facts
- Scalable experimental framework tests LLM sensitivity to semantic changes
- Analogized as needle-in-a-haystack problem with single altered sentences
- Five LLMs tested on tens of thousands of document pairs
- Varied perturbation types: negation, conjunction swaps, named entity replacements
- Context types: original vs. topically unrelated material
- LLMs show a within-document positional bias, penalizing earlier semantic differences more harshly (see the sketch after this list)
- Topically unrelated context systematically lowers similarity scores
- arXiv preprint 2604.18835v1 announced as a cross-listed submission
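To illustrate how the reported within-document positional bias could be surfaced, one simple analysis is to group model-assigned similarity scores by the position of the perturbed sentence. The sketch below assumes such (position, score) pairs are already available; the demo scores are placeholders, not results from the paper:

```python
from collections import defaultdict
from statistics import mean

def positional_profile(results: list[tuple[int, float]]) -> dict[int, float]:
    """results: (needle_index, similarity_score) pairs from LLM judgments.

    Returns the mean similarity per needle position; positions with lower
    means are where the model penalized the perturbation more harshly.
    """
    by_position: dict[int, list[float]] = defaultdict(list)
    for needle_index, score in results:
        by_position[needle_index].append(score)
    return {pos: mean(scores) for pos, scores in sorted(by_position.items())}

# Placeholder scores for demonstration only: a profile that rises with position
# would indicate earlier differences are penalized more, as the paper reports.
demo = [(0, 0.62), (0, 0.60), (10, 0.71), (10, 0.69), (19, 0.78), (19, 0.80)]
print(positional_profile(demo))
```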