TRIAD Framework Predicts Multimodal Conversational Attacks

ai-technology · 2026-05-20

A new paper on arXiv (2605.18988) introduces the Triple-tier Anomaly Defense (TRIAD) framework, a defense against novel multi-turn multimodal attacks on Multimodal Large Language Models (MLLMs). The paper observes that adversaries spread progressive, cross-modal perturbations across a conversation's trajectory to evade guardrails that inspect one turn at a time. It argues that static defenses are constrained by the Markov property, since they condition only on the current turn rather than on the conversation's history. TRIAD instead frames safety verification as dynamic survival prediction and monitors covariance shifts using a Ledoit-Wolf regularized Mahalanobis distance.
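The Ledoit-Wolf regularized Mahalanobis distance mentioned above can be illustrated with a minimal sketch. It is not the paper's implementation: the idea of scoring per-turn embedding vectors against a baseline of benign turns, the synthetic data, and all function names (ledoit_wolf, solve, mahalanobis) are assumptions made for illustration. The shrinkage formula is the standard Ledoit-Wolf estimator, which pulls the empirical covariance toward a scaled identity so that it stays well-conditioned when samples are scarce.

```python
import math
import random

def ledoit_wolf(rows):
    """Return (mean, shrunk covariance, shrinkage intensity) for a list of samples."""
    n, p = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(p)]
    X = [[r[j] - mean[j] for j in range(p)] for r in rows]
    # Empirical (maximum-likelihood) covariance S = X^T X / n.
    S = [[sum(x[i] * x[j] for x in X) / n for j in range(p)] for i in range(p)]
    mu = sum(S[i][i] for i in range(p)) / p  # scale of the identity target
    fro2 = sum(S[i][j] ** 2 for i in range(p) for j in range(p))
    # delta: squared distance between S and the shrinkage target mu * I.
    delta = sum((S[i][j] - (mu if i == j else 0.0)) ** 2
                for i in range(p) for j in range(p)) / p
    # beta: estimated sampling error of S.
    beta = (sum(sum(v * v for v in x) ** 2 for x in X) / n - fro2) / (p * n)
    shrinkage = 0.0 if delta == 0 else max(0.0, min(beta, delta)) / delta
    cov = [[(1 - shrinkage) * S[i][j] + (shrinkage * mu if i == j else 0.0)
            for j in range(p)] for i in range(p)]
    return mean, cov, shrinkage

def solve(A, b):
    """Solve A y = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    y = [0.0] * n
    for r in range(n - 1, -1, -1):
        y[r] = (M[r][n] - sum(M[r][k] * y[k] for k in range(r + 1, n))) / M[r][r]
    return y

def mahalanobis(x, mean, cov):
    """Distance of x from the baseline, measured in units of the baseline's spread."""
    d = [xi - mi for xi, mi in zip(x, mean)]
    y = solve(cov, d)  # y = cov^{-1} d, without forming the inverse
    return math.sqrt(max(0.0, sum(di * yi for di, yi in zip(d, y))))

# Hypothetical baseline: 30 benign turn embeddings in 4 dimensions (synthetic).
random.seed(0)
baseline = [[random.gauss(0.0, 1.0) for _ in range(4)] for _ in range(30)]
mean, cov, shrinkage = ledoit_wolf(baseline)

near = mahalanobis([0.1, 0.1, 0.1, 0.1], mean, cov)  # close to the benign cluster
far = mahalanobis([6.0, 6.0, 6.0, 6.0], mean, cov)   # far outside it
print(f"shrinkage={shrinkage:.3f} near={near:.2f} far={far:.2f}")
```

A turn whose embedding drifts away from the benign cluster receives a larger distance than one near it. In this sketch the regularization matters most when the number of baseline samples is small relative to the embedding dimension, where the unregularized covariance can be close to singular.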

Key facts

arXiv paper 2605.18988
Title: Surviving the Unseen: Predictive Defense for Novel Multi-Turn Multimodal Attacks
Introduces TRIAD (Triple-tier Anomaly Defense) framework
Addresses non-stationary attack surface in MLLMs
Adversaries use progressive cross-modal perturbations
Static defenses constrained by Markov property
Formulates safety as dynamic survival prediction
Uses Ledoit-Wolf regularized Mahalanobis distance
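The survival framing listed above can be illustrated with a generic discrete-time survival sketch. This is an assumption-laden illustration, not TRIAD's actual model: the logistic mapping from an anomaly score to a per-turn hazard, its midpoint and steepness parameters, the 0.5 cutoff, and the example scores are all hypothetical. The only standard ingredient is the survival product S_t = (1 - h_1)(1 - h_2)...(1 - h_t), which accumulates risk over the whole conversation instead of judging each turn alone.

```python
import math

def hazard(score, midpoint=3.0, steepness=1.5):
    """Hypothetical map from a per-turn anomaly score to a hazard in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-steepness * (score - midpoint)))

def survival_curve(scores):
    """Discrete-time survival: S_t is the product of (1 - h_k) for k = 1..t."""
    curve, s = [], 1.0
    for score in scores:
        s *= 1.0 - hazard(score)
        curve.append(s)
    return curve

def first_flagged_turn(scores, min_survival=0.5):
    """Return the first 1-based turn where survival drops below the cutoff, else None."""
    for turn, s in enumerate(survival_curve(scores), start=1):
        if s < min_survival:
            return turn
    return None

# Hypothetical anomaly scores that drift upward a little on every turn.
scores = [1.0, 1.4, 1.8, 2.2, 2.6, 2.8]

per_turn_alarm = any(hazard(s) >= 0.5 for s in scores)  # turn-by-turn check
print("per-turn alarm:", per_turn_alarm)
print("first flagged turn:", first_flagged_turn(scores))
```

In this toy setup every individual score stays below the hazard midpoint, so a check that looks at one turn at a time never fires, while the cumulative survival estimate still falls below the cutoff partway through the conversation. That is the kind of progressive drift the brief describes as evading turn-specific guardrails.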