WaveGuard: Defending Text-to-Image Models from Knowledge Distillation
A new framework called WaveGuard aims to protect closed-weight text-to-image generative models from unauthorized knowledge distillation. Attackers can repeatedly query such a model's API, collect the synthetic images it returns, and train substitute models on them. WaveGuard uses a frequency-aware, single-pass generator to perturb outputs under a user-specified budget, with the goal of preserving visual fidelity while hindering model stealing.
Key facts
- Closed-weight generative services are deployed via query-based APIs.
- Attackers can repeatedly query APIs to collect synthetic images for training substitute models.
- WaveGuard is a single-pass, generator-based protection framework.
- WaveGuard operates under a user-specified perturbation budget.
- WaveGuard employs frequency-aware perturbation.
- The defense aims to preserve visual fidelity of released images.
- The framework scales efficiently to large-volume output release.
- The paper is on arXiv with ID 2605.22060.
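
To make "frequency-aware perturbation under a user-specified budget" concrete, here is a minimal Python sketch. It is not WaveGuard's method: the framework uses a learned single-pass generator, whereas this illustration swaps in random noise restricted to a high-frequency band. The function name `band_limited_perturbation`, the `low_cut` threshold, and the choice of an L-infinity budget `epsilon` are all assumptions made for the example.

```python
import numpy as np


def band_limited_perturbation(image, epsilon, low_cut=0.25, seed=0):
    """Add high-frequency noise to `image` with max per-pixel change <= epsilon.

    Hypothetical illustration only (not the WaveGuard generator).
    image:   float array with values in [0, 1], shape (H, W) or (H, W, C).
    epsilon: assumed L-infinity perturbation budget.
    low_cut: assumed radial cutoff in cycles/pixel; lower frequencies are untouched.
    """
    rng = np.random.default_rng(seed)
    img = np.asarray(image, dtype=np.float64)
    h, w = img.shape[:2]

    # Start from white noise, then keep only the high-frequency band.
    noise = rng.standard_normal(img.shape)
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    mask = (np.sqrt(fy ** 2 + fx ** 2) >= low_cut).astype(np.float64)
    if img.ndim == 3:
        mask = mask[:, :, None]  # broadcast the mask over channels

    spectrum = np.fft.fft2(noise, axes=(0, 1))
    filtered = np.fft.ifft2(spectrum * mask, axes=(0, 1)).real

    # Rescale so the largest per-pixel change equals the budget.
    peak = np.max(np.abs(filtered))
    delta = filtered * (epsilon / peak) if peak > 0 else np.zeros_like(filtered)

    # Clipping to the valid range can only shrink the change, never enlarge it.
    return np.clip(img + delta, 0.0, 1.0)


if __name__ == "__main__":
    demo = np.random.default_rng(1).uniform(0.0, 1.0, size=(64, 64, 3))
    protected = band_limited_perturbation(demo, epsilon=8 / 255)
    print("max per-pixel change:", np.max(np.abs(protected - demo)))
```

Confining the change to a frequency band and capping its magnitude is one simple way to trade off visibility against the size of the perturbation; a learned generator would instead choose the perturbation per image to degrade what a substitute model can learn.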
Entities
Institutions
- arXiv