LLMs Show Genre Bias in News Credibility Assessment
A new study on arXiv reveals that large language models (LLMs) exhibit a genre-specific asymmetry when assessing news credibility: they are more likely to misclassify legitimate entertainment news as fake than comparable hard news. The research, using the GossipCop dataset from FakeNewsNet, tested four frontier models in a zero-shot setting. DeepSeek-V3.2 and GPT-5.2 showed false-positive-rate gaps of 10.1 and 8.8 percentage points respectively (both p < .001), while Claude Opus 4.6 and Gemini 3 Flash showed no significant difference. A style-swap experiment indicated the bias is not solely due to stylistic register. Prompt-based mitigation, such as framing the model as an entertainment-news fact-checker, reduced false positives for DeepSeek-V3.2 but was not universally effective.
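The headline numbers are differences in false-positive rates between genres, tested for significance. A minimal sketch of how such a gap and its p-value could be checked with a standard two-proportion z-test follows; the counts below are hypothetical (chosen to produce a 10.1-point gap) and are not taken from the paper:

```python
import math

def two_proportion_z(fp1, n1, fp2, n2):
    """Two-sided two-proportion z-test on a false-positive-rate gap.

    fp1/n1: misclassified real articles / total real articles, genre 1
    fp2/n2: same counts for genre 2
    Returns (gap, z statistic, two-sided p-value).
    """
    p1, p2 = fp1 / n1, fp2 / n2
    # Pooled proportion under the null hypothesis of equal rates
    p_pool = (fp1 + fp2) / (n1 + n2)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal tail via erfc
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p1 - p2, z, p_value

# Hypothetical counts, NOT from the study:
# 180/1000 entertainment vs 79/1000 hard-news false positives
gap, z, p = two_proportion_z(180, 1000, 79, 1000)
print(f"FPR gap: {gap * 100:.1f} pp, z = {z:.2f}, p = {p:.2g}")
```

With these illustrative counts the gap is 10.1 percentage points and the p-value falls well below .001, matching the form of the reported result.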
Key facts
- Study examines LLM bias in news credibility across genres
- Uses GossipCop dataset from FakeNewsNet
- DeepSeek-V3.2 shows 10.1 percentage point false-positive-rate gap
- GPT-5.2 shows 8.8 percentage point gap
- Claude Opus 4.6 and Gemini 3 Flash show no significant asymmetry
- Style-swap experiment produces only limited changes, suggesting the bias is not purely stylistic
- Prompt-based mitigation reduces false positives for DeepSeek-V3.2
- Published on arXiv with ID 2605.01727
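The prompt-based mitigation described above amounts to reframing the system prompt before the zero-shot classification. A minimal sketch, with hypothetical wording that is not the paper's actual prompt:

```python
# Illustrative prompt framings; the exact wording in the study is not known.
BASELINE = "You are a fact-checker. Label the following article REAL or FAKE."
MITIGATED = (
    "You are an entertainment-news fact-checker. Celebrity and lifestyle "
    "stories are often legitimate even when sensational in tone. "
    "Label the following article REAL or FAKE."
)

def build_messages(article_text, system_prompt):
    """Assemble a chat-style message list for a zero-shot credibility check."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": f"Article:\n{article_text}\n\nLabel:"},
    ]
```

Swapping `BASELINE` for `MITIGATED` is the only change between conditions, which is what makes this kind of mitigation cheap to test per model.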
Entities
Datasets and platforms
- arXiv (preprint repository)
- FakeNewsNet (benchmark collection)
- GossipCop (entertainment-news dataset within FakeNewsNet)