CORDON-MAS: A New Defense Against RAG Knowledge Poisoning
A new research paper presents CORDON-MAS, a framework designed to protect retrieval-augmented generation (RAG) systems from Confundo-style knowledge poisoning. The authors highlight a gap in monitoring and control, where models can identify contradictions in tainted evidence but still respond to false information. By implementing the Cordon Principle, CORDON-MAS divides tasks such as evidence extraction, cross-source auditing, and answer synthesis among agents with differing memory privileges. In tests across five BEIR datasets, it achieves a 92.4% reduction in the success rate of attacks compared to unprotected RAG systems. This study shifts the perspective on RAG poisoning from merely detecting issues to addressing architectural control.
Key facts
- CORDON-MAS defends against Confundo-style poisoning in RAG systems.
- Models exhibit a monitoring-control gap: they detect contradictions but act on poisoned claims.
- The Cordon Principle states no agent capable of final synthesis may access untrusted natural-language evidence.
- CORDON-MAS uses compartmentalized agents with asymmetric memory privileges.
- Tested on five BEIR datasets.
- Reduces attack success rate by 92.4% relative to undefended RAG.
- Reframes RAG poisoning from detection to architectural control.
- Paper published on arXiv with ID 2605.26754.
Entities
Institutions
- arXiv