ARTFEED — Contemporary Art Intelligence

FRA-Attack: Frequency-Domain Method Improves Transferable Attacks on MLLMs

ai-technology · 2026-05-23

Researchers propose FRA-Attack, a frequency-domain regularized adversarial alignment method intended to improve transferable targeted attacks against closed-source multimodal large language models (MLLMs). The approach addresses two obstacles to transfer: redundancy in spatial-domain features and gradient signals that are specific to the surrogate model. By applying a high-pass discrete cosine transform (DCT) objective to patch features, FRA-Attack suppresses redundant global structures and concentrates the loss on high-frequency bands, which the authors describe as capturing intrinsic visual focus shared across models. According to the paper, this improves the cross-model transferability of perturbations optimized on open-source surrogate encoders. The method is detailed in a paper on arXiv (2605.21541).
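To make the idea of a high-pass DCT objective on patch features concrete, here is a minimal NumPy sketch. It is not the paper's implementation: the grid layout of the features (H, W, D), the `u + v >= cutoff` frequency mask, the cosine-based loss, and all function names are assumptions chosen for illustration.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis as an (n, n) matrix."""
    k = np.arange(n)[:, None]   # frequency index
    i = np.arange(n)[None, :]   # spatial index
    mat = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    mat[0, :] = np.sqrt(1.0 / n)  # DC row gets its own scaling
    return mat

def high_pass_dct(feats, cutoff):
    """2D DCT over the patch grid of feats (H, W, D), keeping only
    coefficients with u + v >= cutoff (an assumed high-pass rule)."""
    h, w, _ = feats.shape
    ch, cw = dct_matrix(h), dct_matrix(w)
    # Transform rows and columns of the grid, independently per channel.
    coeffs = np.einsum("uh,hwd,vw->uvd", ch, feats, cw)
    u = np.arange(h)[:, None]
    v = np.arange(w)[None, :]
    mask = (u + v >= cutoff).astype(coeffs.dtype)
    return coeffs * mask[:, :, None]

def high_pass_alignment_loss(adv_feats, target_feats, cutoff=2):
    """1 - cosine similarity between high-pass DCT coefficients
    (hypothetical loss form, used here only for illustration)."""
    a = high_pass_dct(adv_feats, cutoff).ravel()
    t = high_pass_dct(target_feats, cutoff).ravel()
    cos = a @ t / (np.linalg.norm(a) * np.linalg.norm(t) + 1e-12)
    return 1.0 - cos

# Example: a 4x4 patch grid with 8 feature channels.
rng = np.random.default_rng(0)
target = rng.normal(size=(4, 4, 8))
adv = target + 5.0  # same features plus a constant global offset
loss = high_pass_alignment_loss(adv, target, cutoff=1)
```

In this toy setup a constant offset added to every patch lands entirely in the DC coefficient, so the high-pass loss ignores it. That is one simple way to picture how a frequency mask can discount global structure while still comparing finer variation across patches.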

Key facts

  • FRA-Attack is a frequency-domain regularized adversarial alignment method.
  • It targets transferable attacks against closed-source MLLMs.
  • The method uses a high-pass DCT objective on patch features.
  • It suppresses redundant global structures in spatial-domain features.
  • It focuses the loss on high-frequency bands carrying intrinsic visual focus.
  • The approach improves cross-model transferability of perturbations.
  • Perturbations are optimized on open-source surrogate encoders.
  • The paper is available on arXiv with ID 2605.21541.

Entities

Institutions

  • arXiv
