RAG-Pull Attack Exploits Unicode to Inject Malicious Code via Retrieval-Augmented Generation
Researchers have developed RAG-Pull, a black-box attack that exploits Retrieval-Augmented Generation (RAG) systems by inserting invisible Unicode characters into queries or external code repositories. This manipulation redirects retrieval toward attacker-controlled snippets, breaking the model's safety alignment. The attack can achieve near-perfect success when both query and target are perturbed, enabling exploits like remote code execution and SQL injection. RAG-Pull represents a new class of attacks on LLMs, highlighting vulnerabilities in RAG's reliance on external data.
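To make "invisible Unicode characters" concrete, the minimal sketch below shows how two strings can render identically yet differ at the code point level, and how such characters can be found. It is an illustration only, not the paper's method: the choice of zero-width code points, the sample query, and the helper names `find_invisible` and `strip_invisible` are all assumptions made for this example.

```python
import unicodedata

# Assumption: these zero-width code points stand in for the "invisible
# Unicode characters" described above; the character set actually used
# by RAG-Pull is not specified in this summary.
ZERO_WIDTH_SPACE = "\u200b"
ZERO_WIDTH_JOINER = "\u200d"


def find_invisible(text: str) -> list[tuple[int, str]]:
    """Return (index, name) for every format-category (Cf) character in text."""
    return [
        (i, unicodedata.name(ch, f"U+{ord(ch):04X}"))
        for i, ch in enumerate(text)
        if unicodedata.category(ch) == "Cf"
    ]


def strip_invisible(text: str) -> str:
    """Drop every format-category (Cf) character from text."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Cf")


# Hypothetical query, with and without hidden characters.
clean_query = "parse a csv file"
perturbed_query = (
    "parse" + ZERO_WIDTH_SPACE + " a csv" + ZERO_WIDTH_JOINER + " file"
)

if __name__ == "__main__":
    print(perturbed_query == clean_query)        # the strings are not equal
    print(find_invisible(perturbed_query))       # positions of hidden characters
    print(strip_invisible(perturbed_query) == clean_query)
```

Because the perturbed string contains different code points, a retriever that tokenizes or embeds raw text may treat it differently from the clean one, even though a human reviewer sees no change. Removing all `Cf` characters is a blunt filter: it would also strip legitimate uses such as zero-width joiners in emoji sequences or in some scripts, so treat it as a starting point rather than a complete defense.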
Key facts
- RAG-Pull is a black-box attack on Retrieval-Augmented Generation systems.
- It inserts invisible Unicode characters into queries or external code repositories.
- The attack redirects retrieval toward malicious code.
- Combined query-and-target perturbations achieve near-perfect success.
- Exploits include remote code execution and SQL injection.
- The attack breaks the model's safety alignment.
- Minimal perturbations can increase the model's preference for unsafe code.
- The research is available on arXiv (arXiv:2510.11195).
Entities
Institutions
- arXiv