ADMIT Attack Manipulates RAG Fact-Checking via Few-Shot Poisoning
Researchers propose ADMIT (Adversarial Multi-Injection Technique), a few-shot knowledge poisoning attack targeting Retrieval-Augmented Generation (RAG) systems used for fact-checking. Whereas prior work highlighted LLMs' susceptibility to misleading retrieved content, ADMIT operates in realistic settings where credible evidence dominates the retrieval pool. The attack injects semantically aligned adversarial content into knowledge bases, flipping fact-checking decisions and inducing deceptive justifications, all without access to target LLMs or retrievers and without token-level control. This extends knowledge poisoning to fact-checking and demonstrates that LLMs remain vulnerable to manipulated context even when authentic supporting or refuting evidence is present.
Key facts
- ADMIT is a few-shot, semantically aligned poisoning attack.
- It targets RAG-based fact-checking systems.
- The attack flips fact-checking decisions and induces deceptive justifications.
- It requires no access to target LLMs or retrievers and no token-level control.
- Prior work highlighted LLMs' susceptibility to misleading retrieved content.
- In real-world fact-checking scenarios, credible evidence dominates the retrieval pool.
- ADMIT extends knowledge poisoning to the fact-checking setting.
- The attack injects adversarial content into knowledge bases.
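The setting described above, a few injected passages competing with a larger body of credible evidence, can be illustrated with a toy retrieval experiment. The sketch below is not ADMIT's method: the bag-of-words retriever, the passages, and the helper names (retrieve, injected_share) are all hypothetical stand-ins chosen for illustration. It only measures what share of the top-k retrieved passages comes from injected content, a quantity an evaluator might track when auditing a knowledge base.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a string and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(claim, knowledge_base, k):
    """Return the k passages most similar to the claim (toy retriever)."""
    query = Counter(tokenize(claim))
    ranked = sorted(
        knowledge_base,
        key=lambda p: cosine(query, Counter(tokenize(p["text"]))),
        reverse=True,
    )
    return ranked[:k]


def injected_share(passages):
    """Fraction of the given passages that are flagged as injected."""
    if not passages:
        return 0.0
    return sum(p["injected"] for p in passages) / len(passages)


# Hypothetical claim and passages, invented purely for illustration.
claim = "The bridge opened in 1990"

credible = [
    {"text": "Official records show the bridge opened in 1990.", "injected": False},
    {"text": "The city archive confirms the bridge opened to traffic in 1990.", "injected": False},
    {"text": "Newspaper coverage from 1990 describes the bridge opening ceremony.", "injected": False},
    {"text": "Regional weather is mild during spring.", "injected": False},
]

# A single passage written to overlap with the claim while contradicting it.
injected = [
    {"text": "The bridge opened in 1995, not 1990, according to a revised report.", "injected": True},
]

clean_top = retrieve(claim, credible, k=3)
poisoned_top = retrieve(claim, credible + injected, k=3)

print("injected share (clean KB):   ", injected_share(clean_top))
print("injected share (poisoned KB):", injected_share(poisoned_top))
```

In this toy data, the injected passage displaces one credible passage from the top three while the other two retrieved passages remain credible, which mirrors the "credible evidence dominates" setting. Whether a downstream LLM's verdict would change under such a mix is exactly what the attack studies and is not modeled here.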