WaveGuard: Defending Text-to-Image Models from Knowledge Distillation

ai-technology · 2026-05-23

A new framework called WaveGuard aims to protect closed-weight text-to-image generative models from unauthorized knowledge distillation. Because such models are served through query-based APIs, an attacker can query the API repeatedly, collect the synthetic images it returns, and train a substitute model on them. WaveGuard uses a frequency-aware, single-pass generator to perturb released images within a user-specified budget, aiming to preserve visual fidelity while making the outputs less useful for model stealing.
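To make the idea of a budget-bounded, frequency-aware perturbation concrete, here is a minimal NumPy sketch. It is not WaveGuard's method: the summary above says WaveGuard uses a generator, whereas this sketch substitutes random noise as a stand-in. The function names (`high_frequency_mask`, `band_limited_delta`, `protect`), the L-infinity interpretation of the budget, the 4/255 default, and the 0.25 cycles-per-pixel cutoff are all assumptions chosen for illustration.

```python
import numpy as np


def high_frequency_mask(height, width, cutoff=0.25):
    """Boolean mask over the 2-D FFT grid, True where radial frequency >= cutoff.

    `cutoff` is in cycles per pixel (hypothetical default, not from the paper).
    """
    fy = np.fft.fftfreq(height)[:, None]
    fx = np.fft.fftfreq(width)[None, :]
    return np.sqrt(fy ** 2 + fx ** 2) >= cutoff


def band_limited_delta(shape, budget, cutoff=0.25, seed=0):
    """Return a perturbation of `shape` (H, W, C) with |delta| <= budget.

    Random noise stands in for a learned generator (assumption). The noise is
    filtered so that only frequencies above `cutoff` remain, then rescaled so
    its largest absolute value equals the budget.
    """
    height, width, channels = shape
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal((height, width, channels))

    # Zero out low-frequency components, channel by channel.
    spectrum = np.fft.fft2(noise, axes=(0, 1))
    spectrum *= high_frequency_mask(height, width, cutoff)[:, :, None]
    delta = np.fft.ifft2(spectrum, axes=(0, 1)).real

    # Rescale to the L-infinity budget (assumed norm).
    peak = np.abs(delta).max()
    if peak > 0:
        delta *= budget / peak
    return delta


def protect(image, budget=4 / 255, cutoff=0.25, seed=0):
    """Add a band-limited perturbation to `image` (floats in [0, 1], H x W x C)."""
    delta = band_limited_delta(image.shape, budget, cutoff, seed)
    # Clipping keeps pixels valid and cannot push them further from `image`.
    return np.clip(image + delta, 0.0, 1.0)
```

As a usage example, `protect(image, budget=4 / 255)` on a float image in [0, 1] returns an image of the same shape whose pixels each differ from the original by at most the budget. Restricting the perturbation to high spatial frequencies is one simple way to be "frequency-aware"; which bands an actual defense would target is not stated in this summary.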

Key facts

Closed-weight generative services are deployed via query-based APIs.
Attackers can repeatedly query APIs to collect synthetic images for training substitute models.
WaveGuard is a single-pass, generator-based protection framework.
WaveGuard operates under a user-specified perturbation budget.
WaveGuard employs frequency-aware perturbation.
The defense aims to preserve visual fidelity of released images.
The framework scales efficiently to large-volume output release.
The paper is available on arXiv as arXiv:2605.22060.
