ARTFEED — Contemporary Art Intelligence

D²-Monitor: Dynamic Safety Monitoring for Diffusion LLMs via Hesitation-Aware Routing

ai-technology · 2026-05-26

D²-Monitor is a newly proposed safety monitoring method for diffusion large language models (D-LLMs), which generate text through a multi-step denoising process. Unlike autoregressive LLMs, D-LLMs expose intermediate hidden representations that may contain safety-relevant information. The researchers identify "safety hesitation", in which intermediate hidden states repeatedly fall near a lightweight safety probe's decision boundary, as a key signal that the probe is likely to fail. D²-Monitor uses a bi-level routing strategy to allocate monitoring resources dynamically based on this hesitation signal. The paper is available on arXiv (ID 2605.25893).
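The hesitation idea can be illustrated with a minimal Python sketch. It is not the paper's implementation: the logistic probe, the 0.5 decision boundary, the `margin` and `max_hesitation` thresholds, and all function names are assumptions chosen for illustration, and the simple two-way rule in `route` only stands in for the bi-level routing strategy, whose details the summary does not give.

```python
import math
from typing import Sequence


def probe_score(hidden: Sequence[float], w: Sequence[float], b: float) -> float:
    """Hypothetical linear probe: maps one hidden state to P(unsafe) in (0, 1)."""
    logit = sum(h * wi for h, wi in zip(hidden, w)) + b
    return 1.0 / (1.0 + math.exp(-logit))


def count_hesitation_steps(scores: Sequence[float], margin: float = 0.1) -> int:
    """Count denoising steps whose probe score lies within `margin` of the
    assumed decision boundary at 0.5."""
    return sum(1 for s in scores if abs(s - 0.5) < margin)


def route(scores: Sequence[float], margin: float = 0.1, max_hesitation: int = 2) -> str:
    """Assumed routing rule: trust the cheap probe when it rarely hesitates,
    otherwise escalate the sample to a more expensive monitor."""
    if count_hesitation_steps(scores, margin) > max_hesitation:
        return "escalate"
    # Low hesitation: use the probe's verdict at the final denoising step.
    return "unsafe" if scores[-1] >= 0.5 else "safe"


# One probe score per denoising step (made-up values for illustration).
confident_scores = [0.05, 0.10, 0.02, 0.03]
hesitant_scores = [0.45, 0.55, 0.48, 0.90]

print(route(confident_scores))
print(route(hesitant_scores))
```

In this sketch, a sample whose scores stay far from the boundary is decided by the lightweight probe alone, while a sample that lingers near the boundary for more than `max_hesitation` steps is handed to a heavier check, so extra monitoring cost is spent only where the probe appears unreliable.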

Key facts

  • D²-Monitor is a dynamic safety monitoring method for diffusion LLMs.
  • Diffusion LLMs generate text via multi-step denoising, exposing intermediate hidden states.
  • Safety hesitation is defined as intermediate hidden states repeatedly falling near the probe's decision boundary.
  • The number of hesitation steps is an effective predictor of probe failure.
  • D²-Monitor uses bi-level routing for resource allocation.
  • The paper is available on arXiv with ID 2605.25893.
  • The method is motivated by the use of lightweight probes for always-on monitoring.
  • The research addresses a gap in safety monitoring for D-LLMs.

Entities

Institutions

  • arXiv