AE-CoT: Adaptive Evolutionary Jailbreak for LLMs

ai-technology · 2026-05-26

A new research paper on arXiv (2605.24497) proposes AE-CoT, an adaptive evolutionary chain-of-thought (CoT) jailbreak framework targeting Large Reasoning Models (LRMs). The method rewrites harmful goals into mild prompts using teacher role-play, decomposes them into coherent reasoning fragments, and runs a multi-generation evolutionary search to expand candidate diversity. The target is the explicit CoT mechanism in LRMs, a weakness that static jailbreak templates fail to exploit effectively because of their limited diversity and adaptability.

Key facts

arXiv:2605.24497
Proposes the AE-CoT framework
Targets Large Reasoning Models (LRMs)
Uses an adaptive evolutionary chain-of-thought jailbreak
Rewrites harmful goals into mild prompts with teacher role-play
Decomposes prompts into coherent reasoning fragments
Runs a multi-generation evolutionary search to expand candidate diversity
Addresses limitations of static CoT jailbreak templates
