D²-Monitor: Dynamic Safety Monitoring for Diffusion LLMs via Hesitation-Aware Routing

ai-technology · 2026-05-26

A new safety monitoring method, D²-Monitor, has been proposed for diffusion large language models (D-LLMs), which generate text through a multi-step denoising process. Unlike autoregressive LLMs, D-LLMs expose intermediate hidden representations that may contain safety-relevant information. The researchers identify "safety hesitation", in which intermediate hidden states repeatedly fall near a safety probe's decision boundary, as a key signal predicting probe failure. D²-Monitor uses a bi-level routing strategy to allocate monitoring resources dynamically based on this hesitation signal. The paper is available on arXiv (2605.25893).

Key facts

D²-Monitor is a dynamic safety monitoring method for diffusion LLMs.
Diffusion LLMs generate text via multi-step denoising, exposing intermediate hidden states.
Safety hesitation is defined as intermediate hidden states repeatedly falling near the probe's decision boundary.
The number of hesitation steps is reported to be an effective predictor of probe failure.
D²-Monitor uses bi-level routing for resource allocation.
The paper is available on arXiv with ID 2605.25893.
The method is motivated by the use of lightweight probes for always-on monitoring.
The research addresses a gap in safety monitoring for D-LLMs.
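
The hesitation signal described above can be illustrated with a minimal sketch. This is not the paper's implementation: the linear probe with a sigmoid output, the function names (probe_score, count_hesitation_steps, route_sample), and the margin and max_hesitation parameters are all hypothetical choices made for illustration. The single threshold decision at the end only stands in for the idea of spending more monitoring effort on hesitant samples; it does not reproduce the bi-level routing strategy, whose details are not given here.

```python
import math
from typing import Sequence


def probe_score(hidden: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """Hypothetical lightweight probe: linear layer plus sigmoid.

    Returns a value in (0, 1), read here as the probability that the
    hidden state is unsafe. The probe form is an assumption.
    """
    logit = sum(h * w for h, w in zip(hidden, weights)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


def count_hesitation_steps(
    scores: Sequence[float], boundary: float = 0.5, margin: float = 0.1
) -> int:
    """Count denoising steps whose probe score lies within `margin`
    of the decision boundary (assumed definition of "near")."""
    return sum(1 for s in scores if abs(s - boundary) <= margin)


def route_sample(scores: Sequence[float], max_hesitation: int = 2) -> str:
    """Illustrative routing rule: escalate to heavier monitoring when the
    probe hesitates on more than `max_hesitation` steps."""
    if count_hesitation_steps(scores) > max_hesitation:
        return "escalate"
    return "probe_only"


# Toy probe parameters and two toy trajectories of hidden states,
# one hidden state per denoising step (all values are made up).
weights, bias = [1.0, -1.0], 0.0
hesitant = [[0.1, 0.0], [0.0, 0.2], [0.3, 0.1], [3.0, 0.0]]
confident = [[3.0, 0.0], [4.0, 0.0], [5.0, 0.0], [6.0, 0.0]]

hesitant_scores = [probe_score(h, weights, bias) for h in hesitant]
confident_scores = [probe_score(h, weights, bias) for h in confident]

print(route_sample(hesitant_scores))   # three of four steps sit near 0.5
print(route_sample(confident_scores))  # no step sits near 0.5
```

In this toy setup the first trajectory spends three steps close to the boundary and is escalated, while the second never approaches it and stays on the cheap probe. The margin and step threshold would need to be tuned on real data.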
