TRIAD Framework Predicts Multimodal Conversational Attacks
A new arXiv paper (2605.18988) introduces the Triple-tier Anomaly Defense (TRIAD) framework, which defends Multimodal Large Language Models (MLLMs) against novel multi-turn multimodal attacks. The research identifies that adversaries spread progressive, cross-modal perturbations across a conversational trajectory so that they evade turn-specific guardrails. Static defenses are constrained by the Markov property: they judge each turn from the current state alone, without accounting for how the conversation reached it. TRIAD instead formulates safety verification as dynamic survival prediction and monitors covariance shifts using a Ledoit-Wolf regularized Mahalanobis distance.
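To make the distance measure concrete, here is a minimal sketch of a Ledoit-Wolf regularized Mahalanobis distance in NumPy. It is not the paper's implementation: the function names, the idea of summarizing each turn as a feature vector, and the synthetic "benign" data are all assumptions made for illustration. The shrinkage step follows the standard Ledoit-Wolf estimator, which blends the sample covariance with a scaled identity matrix so that the result stays invertible even when there are few reference samples relative to the number of features.

```python
import numpy as np


def ledoit_wolf_covariance(X):
    """Shrink the sample covariance of X (n samples x p features) toward mu * I.

    Returns (shrunk_covariance, shrinkage) with shrinkage in [0, 1].
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    S = Xc.T @ Xc / n                    # biased sample covariance
    mu = np.trace(S) / p                 # average variance, scale of the identity target
    # Squared distance between S and the target mu * I.
    d2 = np.sum((S - mu * np.eye(p)) ** 2) / p
    # Estimated squared error of S itself.
    X2 = Xc ** 2
    b2 = (np.sum(X2.T @ X2) / n - np.sum(S ** 2)) / (n * p)
    b2 = min(max(b2, 0.0), d2)
    shrinkage = 0.0 if d2 == 0 else b2 / d2
    shrunk = (1.0 - shrinkage) * S + shrinkage * mu * np.eye(p)
    return shrunk, shrinkage


def mahalanobis(x, mean, cov):
    """Mahalanobis distance of x from a distribution with the given mean and covariance."""
    diff = np.asarray(x, dtype=float) - mean
    return float(np.sqrt(diff @ np.linalg.solve(cov, diff)))


# Hypothetical reference set: 40 benign turns, 16 features each (synthetic data).
rng = np.random.default_rng(0)
benign = rng.normal(size=(40, 16))
mean = benign.mean(axis=0)
cov, shrinkage = ledoit_wolf_covariance(benign)

# A turn that drifts far from the benign mean scores a larger distance.
near = mahalanobis(mean + 0.1, mean, cov)
far = mahalanobis(mean + 5.0, mean, cov)
```

In this sketch a large distance signals that a turn's features no longer look like the reference distribution. Where scikit-learn is available, `sklearn.covariance.LedoitWolf` provides the same shrinkage estimator.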
Key facts
- arXiv paper 2605.18988
- Title: Surviving the Unseen: Predictive Defense for Novel Multi-Turn Multimodal Attacks
- Introduces TRIAD (Triple-tier Anomaly Defense) framework
- Addresses non-stationary attack surface in MLLMs
- Adversaries use progressive cross-modal perturbations
- Static defenses constrained by Markov property
- Formulates safety as dynamic survival prediction
- Uses Ledoit-Wolf regularized Mahalanobis distance
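
The contrast between a per-turn check and the survival framing can be illustrated with a generic discrete-time survival calculation. This is a sketch under stated assumptions, not TRIAD's model: the per-turn hazard values, the 0.5 threshold, and the function names are hypothetical. If h_k is the estimated risk that turn k is the one that breaks safety, the probability that the conversation is still safe after turn t is the product of (1 - h_k) over all turns so far.

```python
from itertools import accumulate
from operator import mul


def survival_curve(hazards):
    """Return S_t = product over k <= t of (1 - h_k) for per-turn hazards in [0, 1]."""
    for h in hazards:
        if not 0.0 <= h <= 1.0:
            raise ValueError("each hazard must lie in [0, 1]")
    return list(accumulate((1.0 - h for h in hazards), mul))


def first_unsafe_turn(hazards, min_survival):
    """1-based index of the first turn whose cumulative survival falls below
    min_survival, or None if the conversation never crosses the threshold."""
    for turn, s in enumerate(survival_curve(hazards), start=1):
        if s < min_survival:
            return turn
    return None


# Hypothetical hazards: every turn looks mild when judged on its own.
hazards = [0.2, 0.2, 0.2, 0.2]
per_turn_flags = [h >= 0.5 for h in hazards]       # a per-turn filter flags nothing
flagged_at = first_unsafe_turn(hazards, 0.5)       # cumulative view flags turn 4
```

Each turn here stays below the per-turn threshold, yet the cumulative survival probability drops to about 0.41 by the fourth turn, which is the kind of gradual drift a history-free check cannot see.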
Entities
Institutions
- arXiv