FRA-Attack: Frequency-Domain Method Improves Transferable Attacks on MLLMs
Researchers propose FRA-Attack, a frequency-domain regularized adversarial alignment method that improves transferable targeted attacks against closed-source multimodal large language models (MLLMs). The approach addresses two obstacles to transfer: redundancy in spatial-domain features and gradient signals that are specific to the surrogate model. FRA-Attack applies a high-pass discrete cosine transform (DCT) objective to patch features, which suppresses redundant global structures and concentrates the loss on the high-frequency bands that capture the intrinsic visual focus shared across models. As a result, perturbations optimized on open-source surrogate encoders transfer better to other models. The method is detailed in a paper on arXiv (2605.21541).
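To make the idea of a high-pass DCT objective on patch features concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the paper's implementation: the function names (`dct_matrix`, `high_pass_dct`, `high_freq_alignment_loss`), the triangular mask shape, the `cutoff` value, and the cosine-distance loss are all hypothetical choices, since this summary does not specify them. Patch features are assumed to be arranged as an (H, W, C) grid, as a ViT-style surrogate encoder would produce.

```python
import numpy as np


def dct_matrix(n: int) -> np.ndarray:
    """Return the orthonormal DCT-II matrix of size n x n."""
    k = np.arange(n)[:, None]  # frequency index (rows)
    i = np.arange(n)[None, :]  # spatial index (columns)
    mat = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    mat[0, :] = np.sqrt(1.0 / n)  # DC row uses a different scale factor
    return mat


def high_pass_dct(features: np.ndarray, cutoff: int) -> np.ndarray:
    """Apply a 2D DCT over the patch grid and zero the low-frequency band.

    features: array of shape (H, W, C), one C-dim feature per patch.
    cutoff:   assumed hyperparameter; coefficients with u + v < cutoff
              are treated as low-frequency and removed.
    """
    h, w, _ = features.shape
    dh, dw = dct_matrix(h), dct_matrix(w)
    # Separable 2D DCT applied independently to every channel.
    coeffs = np.einsum("uh,hwc,vw->uvc", dh, features, dw)
    u = np.arange(h)[:, None]
    v = np.arange(w)[None, :]
    mask = (u + v) >= cutoff  # keep only the high-frequency coefficients
    return coeffs * mask[:, :, None]


def high_freq_alignment_loss(adv_feats, tgt_feats, cutoff=4):
    """Cosine distance between high-pass DCT coefficients (assumed loss form)."""
    a = high_pass_dct(adv_feats, cutoff).ravel()
    t = high_pass_dct(tgt_feats, cutoff).ravel()
    cos = a @ t / (np.linalg.norm(a) * np.linalg.norm(t) + 1e-12)
    return 1.0 - cos


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    target = rng.standard_normal((8, 8, 4))  # stand-in for target patch features
    shifted = target + 5.0                   # same features plus a global offset
    # The global offset lands only in the DC coefficient, which the mask removes,
    # so the loss stays near zero.
    print(high_freq_alignment_loss(shifted, target))
```

In this sketch a constant offset added to every patch leaves the loss essentially unchanged, which is one simple way to see how a high-pass objective ignores global structure. In an actual attack the loss would be minimized with respect to the input perturbation through a differentiable surrogate encoder; that optimization loop is omitted here.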
Key facts
- FRA-Attack is a frequency-domain regularized adversarial alignment method.
- It targets transferable attacks against closed-source MLLMs.
- The method uses a high-pass DCT objective on patch features.
- It suppresses redundant global structures in spatial-domain features.
- It focuses loss on high-frequency bands carrying intrinsic visual focus.
- The approach improves cross-model transferability of perturbations.
- Perturbations are optimized on open-source surrogate encoders.
- The paper is available on arXiv with ID 2605.21541.
Entities
Institutions
- arXiv