ARTFEED — Contemporary Art Intelligence

XAttnMark: Cross-Attention Audio Watermarking for Generative AI

ai-technology · 2026-05-25

Researchers have introduced XAttnMark (Cross-Attention Robust Audio Watermark), a neural method for embedding imperceptible watermarks in audio, aimed at copyright and deepfake concerns. The system shares part of its parameters between the watermark generator and the detector, retrieves the embedded message with a cross-attention mechanism, and uses a temporal conditioning module to distribute the message. A psychoacoustically aligned time-frequency masking loss improves imperceptibility. The method aims to jointly optimize robust detection and accurate attribution, overcoming limitations of prior techniques such as WavMark and AudioSeal.
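To make the idea of cross-attention message retrieval more concrete, here is a minimal sketch in plain Python. It is a generic illustration of scaled dot-product cross-attention, not the architecture from the paper: the per-bit query vectors, the linear readout, the random weights and all sizes (DIM, N_FRAMES, N_BITS) are assumptions chosen for the example.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention: every query attends over all frames."""
    scale = 1.0 / math.sqrt(len(keys[0]))
    outputs, weights = [], []
    for q in queries:
        w = softmax([dot(q, k) * scale for k in keys])
        pooled = [
            sum(wi * v[d] for wi, v in zip(w, values))
            for d in range(len(values[0]))
        ]
        outputs.append(pooled)
        weights.append(w)
    return outputs, weights


def decode_bits(bit_queries, frame_features, readout):
    """Hypothetical decoder: one query per message bit, then a linear readout."""
    pooled, weights = cross_attention(bit_queries, frame_features, frame_features)
    logits = [dot(p, readout) for p in pooled]
    bits = [1 if z > 0 else 0 for z in logits]
    return bits, weights


# Illustrative sizes and random (untrained) weights, assumed for this sketch.
rng = random.Random(0)
DIM, N_FRAMES, N_BITS = 8, 20, 16
frame_features = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(N_FRAMES)]
bit_queries = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(N_BITS)]
readout = [rng.gauss(0, 1) for _ in range(DIM)]

bits, weights = decode_bits(bit_queries, frame_features, readout)
print(bits)
```

In this sketch each message bit owns a query that pools information from every audio frame, so the decoder does not depend on the message sitting at a fixed position in time. In a trained system the queries and readout would be learned jointly with the generator.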

Key facts

  • XAttnMark stands for Cross-Attention Robust Audio Watermark.
  • It is introduced in arXiv paper 2502.04230.
  • The method targets copyright infringement and deepfake audio.
  • It uses partial parameter sharing between generator and detector.
  • A cross-attention mechanism enables efficient message retrieval.
  • A temporal conditioning module improves message distribution.
  • A psychoacoustic-aligned time-frequency (TF) masking loss models auditory masking to keep the watermark imperceptible.
  • Prior methods include WavMark and AudioSeal.
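The masking-loss idea in the list above can be illustrated with a toy example. The sketch below is a deliberate simplification and not the loss from the paper: it assumes power spectrograms stored as nested lists (time by frequency), a masking threshold equal to the loudest nearby host bin lowered by a fixed offset, and a penalty only on watermark power above that threshold. The function names and the parameters spread and offset_db are hypothetical.

```python
def masking_threshold(host_power, spread=1, offset_db=-12.0):
    """Per-bin threshold: loudest host power within `spread` bins in time
    and frequency, lowered by `offset_db` decibels (assumed values)."""
    gain = 10 ** (offset_db / 10.0)
    n_t, n_f = len(host_power), len(host_power[0])
    thr = [[0.0] * n_f for _ in range(n_t)]
    for t in range(n_t):
        for f in range(n_f):
            neighbours = [
                host_power[tt][ff]
                for tt in range(max(0, t - spread), min(n_t, t + spread + 1))
                for ff in range(max(0, f - spread), min(n_f, f + spread + 1))
            ]
            thr[t][f] = gain * max(neighbours)
    return thr


def masking_loss(host_power, wm_power, spread=1, offset_db=-12.0):
    """Mean watermark power that exceeds the masking threshold."""
    thr = masking_threshold(host_power, spread, offset_db)
    excess = [
        max(0.0, w - th)
        for row_w, row_th in zip(wm_power, thr)
        for w, th in zip(row_w, row_th)
    ]
    return sum(excess) / len(excess)


# Host audio with a single loud time-frequency bin at (0, 0).
host = [[1.0, 0.0, 0.0, 0.0],
        [0.0, 0.0, 0.0, 0.0],
        [0.0, 0.0, 0.0, 0.0]]

# Watermark energy placed next to the loud bin (masked by the host).
hidden = [[0.01, 0.01, 0.0, 0.0],
          [0.01, 0.01, 0.0, 0.0],
          [0.0, 0.0, 0.0, 0.0]]

# Same energy placed in a quiet region far from the loud bin (unmasked).
exposed = [[0.0, 0.0, 0.0, 0.0],
           [0.0, 0.0, 0.0, 0.0],
           [0.0, 0.0, 0.0, 0.01]]

loss_hidden = masking_loss(host, hidden)
loss_exposed = masking_loss(host, exposed)
print(loss_hidden, loss_exposed)
```

The point of such a loss is that watermark energy placed next to loud host content costs nothing, while the same energy placed in a quiet region is penalized, which pushes a generator toward regions where the ear is less likely to notice the change.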

Entities

Institutions

  • arXiv

Sources