Caption Poisoning Attacks on Text-to-Music Systems
A new preprint on arXiv describes a security vulnerability in retrieval-augmented text-to-music (TTM) systems. The researchers propose a dual-layer caption poisoning attack that injects crafted music captions into the knowledge database, so that the system retrieves the malicious captions and generation is steered away from the user's intent. The attack exploits the system's dependence on the integrity of the music knowledge database and does not modify the user prompt, the retriever, or the generator. In experiments with a pipeline built on the MusicCaps database, a CLAP retriever, and MusicGen, poisoned generations moved substantially closer to the attacker-chosen target intent. The study shows that an untrusted knowledge database is an attack surface for retrieval-augmented music generation.
Key facts
- arXiv paper 2605.30365
- Retrieval-augmented text-to-music systems vulnerable to caption poisoning
- Dual-layer caption poisoning strategy proposed
- Attack preserves high-level retrieval anchors while injecting low-level acoustic descriptors
- Experiments used MusicCaps database, CLAP retriever, and MusicGen pipeline
- Poisoned generations moved closer to attacker-chosen target intent
- No modification of user prompt, retriever, or generator required
- The vulnerability is the system's dependence on the integrity of the music knowledge database
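The facts above can be illustrated with a minimal sketch. This is not the paper's implementation: the bag-of-words `embed` function is a toy stand-in for a learned encoder such as CLAP, and the captions, the `make_poisoned_caption` helper, and the `augment` prompt format are all hypothetical. It only shows why a caption that keeps the high-level retrieval anchors and appends attacker-chosen acoustic descriptors can win retrieval and reach the generator's context, while the user prompt, retriever, and generator stay untouched.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words embedding (assumption: stands in for a CLAP-style text encoder)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, database, k=1):
    """Return the k captions most similar to the query."""
    q = embed(query)
    return sorted(database, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


def make_poisoned_caption(anchors, target_descriptors):
    """Hypothetical dual-layer caption: retrieval anchors plus attacker-chosen acoustic descriptors."""
    return f"{anchors}, {target_descriptors}"


def augment(prompt, retrieved):
    """Hypothetical prompt format: the user prompt is kept verbatim, retrieved captions are appended."""
    return prompt + " | context: " + " ; ".join(retrieved)


# Hypothetical clean knowledge database of music captions.
clean_db = [
    "a calm piano piece with soft strings for relaxing in the evening",
    "an energetic rock track with driving drums and electric guitar",
    "a relaxing ambient track with gentle pads and slow tempo",
]

user_prompt = "calm relaxing piano music"

# The attacker only adds one entry to the database.
poison = make_poisoned_caption(
    anchors="calm relaxing piano music",
    target_descriptors="harsh distorted bass and fast aggressive drums",
)
poisoned_db = clean_db + [poison]

clean_top = retrieve(user_prompt, clean_db)
poisoned_top = retrieve(user_prompt, poisoned_db)

print("clean context:   ", augment(user_prompt, clean_top))
print("poisoned context:", augment(user_prompt, poisoned_top))
```

In this toy setup the poisoned caption outranks every clean caption because it shares all of the query's anchor terms, so the attacker's descriptors end up in the context handed to the generator. A real encoder would score captions differently, so the sketch illustrates the mechanism only, not the reported results.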
Entities
Institutions
- arXiv
- MusicCaps
- CLAP
- MusicGen