ARTFEED — Contemporary Art Intelligence

Comic-Based Jailbreaks Threaten Multimodal AI Safety

ai-technology · 2026-04-25

A recent study published on arXiv shows that comic-style visual narratives can circumvent safety measures in multimodal large language models (MLLMs). The researchers introduce ComicJailbreak, a benchmark of 1,167 attack instances spanning 10 harm categories and 5 task setups, each embedding a harmful objective within a simple three-panel comic. Across 15 state-of-the-art MLLMs (6 commercial, 9 open-source), the comic-based attacks achieved success rates comparable to strong rule-based jailbreaks, with ensemble success rates exceeding 90% on several commercial models. Existing defenses mitigated the harmful comics but introduced performance trade-offs. The study highlights a previously underexplored safety vulnerability of MLLMs when instructions are grounded in visual narratives.
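The "ensemble" success rate mentioned above typically counts a scenario as jailbroken if at least one attack variant succeeds against the model. The article does not specify the exact formula, so the snippet below is an illustrative sketch of that common convention; the scenario names and outcome data are invented for the example.

```python
def ensemble_asr(outcomes):
    """Ensemble attack success rate (illustrative).

    outcomes: dict mapping scenario id -> list of booleans, one per
    attack variant (e.g., the different task setups). A scenario counts
    as a success if ANY variant elicited a harmful response.
    Returns the fraction of scenarios with at least one success.
    """
    if not outcomes:
        return 0.0
    hits = sum(1 for results in outcomes.values() if any(results))
    return hits / len(outcomes)

# Toy data: 4 scenarios, 3 attack variants each (hypothetical values)
outcomes = {
    "s1": [False, True, False],
    "s2": [True, False, False],
    "s3": [False, False, False],
    "s4": [True, True, False],
}
print(ensemble_asr(outcomes))  # → 0.75
```

Because a single success per scenario suffices, the ensemble rate is always at least as high as any individual variant's rate, which is why ensembles can push past 90% even when each variant alone succeeds less often.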

Key facts

  • ComicJailbreak benchmark includes 1,167 attack instances
  • Covers 10 harm categories and 5 task setups
  • Tested on 15 state-of-the-art MLLMs (6 commercial, 9 open-source)
  • Ensemble success rates exceeded 90% on several commercial models
  • Comic-based attacks match the success rates of strong rule-based jailbreaks
  • They outperform plain-text and random-image baselines
  • Existing defenses mitigate the attacks but introduce performance trade-offs
  • Study published on arXiv (2603.21697)

Entities

Platforms and benchmarks

  • arXiv
  • JailbreakBench
  • JailbreakV

Sources