ALDEN Attack Boosts Private Data Extraction from RAG Systems via Active Learning
Researchers propose ALDEN, a novel attack that enhances private data extraction from Retrieval-Augmented Generation (RAG) systems. RAG augments large language models with external knowledge retrieval to improve reliability, but it remains vulnerable to data extraction attacks in which adversaries embed malicious commands into user queries. Existing attacks suffer from low extraction rates and limited practical effectiveness. ALDEN employs active learning to diversify malicious queries and introduces a decay-based dynamic algorithm that estimates the topic distribution of the underlying knowledge base and uses that estimate to guide query generation. By combining these two methods, ALDEN achieves efficient and effective extraction of private data from RAG systems. The paper is available on arXiv under identifier 2605.18762.
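The summary does not say how the decay-based estimate is computed, so the following is only a generic sketch of one way a decayed topic distribution could be maintained, not the paper's algorithm. The class name `DecayedTopicEstimator`, the `decay` parameter, and the topic labels are all hypothetical; the sketch assumes each round of observations has already been mapped to topic labels by some other component.

```python
from collections import defaultdict


class DecayedTopicEstimator:
    """Exponentially decayed topic counts, normalized into a distribution.

    Illustrative assumption only: older observations are down-weighted by a
    constant factor each round, so the estimate tracks recent evidence.
    """

    def __init__(self, decay: float = 0.9):
        if not 0.0 < decay <= 1.0:
            raise ValueError("decay must be in (0, 1]")
        self.decay = decay
        self.weights = defaultdict(float)

    def update(self, observed_topics):
        """Age all existing weights, then add one unit per new observation."""
        for topic in self.weights:
            self.weights[topic] *= self.decay
        for topic in observed_topics:
            self.weights[topic] += 1.0

    def distribution(self):
        """Return the current weights normalized to sum to 1."""
        total = sum(self.weights.values())
        if total == 0:
            return {}
        return {topic: w / total for topic, w in self.weights.items()}


# Two rounds of hypothetical topic labels with a strong decay of 0.5.
estimator = DecayedTopicEstimator(decay=0.5)
estimator.update(["billing"])
estimator.update(["health", "health"])
dist = estimator.distribution()
print(dist)
```

With `decay=0.5`, the first round's single "billing" observation is halved to 0.5 before the two "health" observations are added, so the estimate is 0.2 for "billing" and 0.8 for "health". A decay closer to 1 would weight older rounds more heavily.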
Key facts
- ALDEN is a novel attack for extracting private data from RAG systems.
- RAG augments LLMs with external knowledge retrieval.
- Existing data extraction attacks achieve low extraction rates and have limited practical effectiveness.
- ALDEN uses active learning to diversify malicious queries.
- A decay-based dynamic algorithm estimates the topic distribution of the underlying knowledge base and guides query generation.
- The attack combines active learning with topic-distribution estimation.
- The paper is on arXiv: 2605.18762.
- The attack exploits the vulnerability of RAG systems to malicious commands embedded in user queries.
Entities
Institutions
- arXiv