Plasticity Interventions Reduce Backdoor Threats in DRL, Except SAM
A new arXiv preprint (2605.14587) systematically investigates how plasticity interventions—built-in components of modern deep reinforcement learning (DRL) agents—affect vulnerability to backdoor attacks. Across 14,664 empirical cases, the researchers found that only the Sharpness-Aware Minimization (SAM) intervention exacerbates backdoor threats, attributing this to backdoor gradient amplification. All other interventions studied mitigate the threat by disrupting the activation pathways backdoors rely on and compressing the representation space. The work highlights a gap in prior research, which evaluated backdoor attacks only in vanilla training setups without plasticity interventions, leaving practical DRL deployments exposed to unassessed risk.
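For readers unfamiliar with SAM, the following minimal sketch shows its two-step update: ascend to a worst-case neighbor of the current weights, then apply the gradient computed there at the original weights. This second-step gradient is the quantity that, per the study's finding, can amplify backdoor-associated gradients rather than average them away. The toy loss and hyperparameters below are illustrative, not taken from the paper.

```python
import numpy as np

def loss_grad(w):
    # Toy quadratic loss L(w) = ||w||^2 / 2, so the gradient is simply w.
    return w

def sam_step(w, rho=0.05, lr=0.1):
    g = loss_grad(w)
    # Step 1: perturb toward the worst-case neighbor within radius rho.
    eps = rho * g / (np.linalg.norm(g) + 1e-12)
    # Step 2: compute the gradient at the perturbed ("sharp") point...
    g_sharp = loss_grad(w + eps)
    # ...and apply it as the update at the original weights.
    return w - lr * g_sharp

w = np.array([1.0, -2.0])
for _ in range(50):
    w = sam_step(w)
```

On this convex toy loss the iterates shrink toward the origin; the point of the sketch is only the two-step structure of the update, not convergence behavior.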
Key facts
- arXiv paper 2605.14587 investigates plasticity interventions in DRL backdoor attacks.
- 14,664 cases were empirically studied combining interventions and attack scenarios.
- Only SAM intervention exacerbates backdoor threats.
- Other interventions mitigate backdoor threats.
- Exacerbation is attributed to backdoor gradient amplification.
- Mitigation stems from activation pathway disruption and representation space compression.
- Plasticity interventions are built-in components of modern DRL agents.
- Prior studies focused on vanilla scenarios without plasticity interventions.
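To make the threat model in the facts above concrete, here is a hedged illustration of the kind of backdoor typically studied in DRL: a small trigger pattern stamped onto the agent's observation causes a compromised policy to emit an attacker-chosen action. The trigger shape, location, and hard-coded policy below are invented for illustration and are not the paper's construction.

```python
import numpy as np

def apply_trigger(obs, value=1.0, size=3):
    # Stamp a bright square into the top-left corner of an image observation.
    patched = obs.copy()
    patched[:size, :size] = value
    return patched

def backdoored_policy(obs):
    # Stand-in for a learned policy: behaves normally on clean observations,
    # but returns the attacker-chosen action when the trigger is present.
    if obs[:3, :3].mean() > 0.9:  # trigger "detector" (hypothetical)
        return 0                  # attacker-chosen action
    return 1 if obs.mean() > 0.5 else 2

obs = np.zeros((8, 8))
clean_action = backdoored_policy(obs)
trigger_action = backdoored_policy(apply_trigger(obs))
```

The mitigation mechanisms listed above (activation pathway disruption, representation compression) work by degrading exactly the kind of trigger-conditioned internal behavior this sketch caricatures.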
Entities
Institutions
- arXiv (preprint repository hosting the paper)