ARTFEED — Contemporary Art Intelligence

ArcMark: Distortion-Free Multi-Byte LLM Watermark via Optimal Transport

ai-technology · 2026-05-25

Researchers have developed ArcMark, a watermarking method for large language models (LLMs) that embeds multiple bytes of information in generated text without perturbing the model's average next-token predictions. Existing watermarks typically encode a single bit per token, which limits their capacity. ArcMark, built on coding and information-theoretic principles, is reported to reliably embed data such as user IDs, model versions, or even the prompt itself, broadening the potential applications for responsible LLM use. The paper is available on arXiv (2602.07235).

Key facts

  • ArcMark is a new multi-byte LLM watermarking method.
  • It embeds information without perturbing average next-token predictions.
  • Existing watermarks typically encode a single bit per token.
  • ArcMark can embed user IDs, model versions, or prompts.
  • The method is based on coding and information-theoretic principles.
  • The paper is available on arXiv with ID 2602.07235.
  • It aims to promote responsible use of large language models.
  • The approach is described as distortion-free.
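
To make the "distortion-free" claim above concrete, here is a minimal Python sketch of what it means for a watermark to leave average next-token predictions unperturbed. It does not implement ArcMark or its optimal-transport construction, which this summary does not describe. It substitutes a generic Gumbel-max style keyed sampler, and the function names, the SHA-256 seeding, and the toy three-token vocabulary are all assumptions made for illustration. The idea: a secret key and the recent context fix the pseudorandom numbers, so the token choice is reproducible for anyone holding the key, yet averaged over keys each token is still chosen with the model's own probability.

```python
import hashlib
import math
import random


def keyed_uniforms(key: bytes, context: tuple, vocab_size: int) -> list[float]:
    """Derive one pseudorandom number in (0, 1] per vocabulary entry.

    Hypothetical helper: the seed depends only on the secret key and the
    recent context, so a detector holding the key can recompute the values.
    """
    seed = hashlib.sha256(key + repr(context).encode("utf-8")).digest()
    rng = random.Random(seed)
    # 1 - random() lies in (0, 1], which keeps log() finite below.
    return [1.0 - rng.random() for _ in range(vocab_size)]


def watermarked_sample(probs: list[float], key: bytes, context: tuple) -> int:
    """Pick a token index with the Gumbel-max (exponential race) trick.

    -log(u_i) / p_i is exponential with rate p_i, and the smallest of these
    belongs to index i with probability p_i. So over random keys the choice
    follows `probs` exactly, which is the distortion-free property.
    """
    uniforms = keyed_uniforms(key, context, len(probs))
    best_index, best_score = -1, -math.inf
    for index, (p, u) in enumerate(zip(probs, uniforms)):
        if p <= 0.0:
            continue  # zero-probability tokens can never be selected
        score = math.log(u) / p  # maximising this minimises -log(u) / p
        if score > best_score:
            best_index, best_score = index, score
    return best_index


def empirical_distribution(probs: list[float], context: tuple,
                           num_keys: int = 20000) -> list[float]:
    """Estimate how often each token is chosen across many different keys."""
    counts = [0] * len(probs)
    for k in range(num_keys):
        key = k.to_bytes(4, "big")
        counts[watermarked_sample(probs, key, context)] += 1
    return [count / num_keys for count in counts]


if __name__ == "__main__":
    model_probs = [0.5, 0.3, 0.2]  # toy next-token distribution (assumed)
    print(empirical_distribution(model_probs, context=(17, 42)))
```

Run as a script, it prints frequencies that should sit close to the toy distribution, while any single key always yields the same token for the same context. This sketch shows only the no-distortion property; carrying a multi-byte payload such as a user ID would need the coding layer the paper describes, which is not reproduced here.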

Entities

Institutions

  • arXiv
