ARTFEED — Contemporary Art Intelligence

Plasticity Interventions Reduce Backdoor Threats in DRL, Except SAM

ai-technology · 2026-05-16

A new study on arXiv (2605.14587) systematically investigates how plasticity interventions—built-in components of modern deep reinforcement learning (DRL) agents—affect vulnerability to backdoor attacks. Analyzing 14,664 cases, the researchers found that only the Sharpness-Aware Minimization (SAM) intervention exacerbates backdoor threats, an effect attributed to backdoor gradient amplification. All other interventions mitigate threats by disrupting activation pathways and compressing the representation space. The work highlights a critical gap in prior research, which evaluated backdoor attacks only in vanilla scenarios without plasticity interventions, a blind spot that poses risks for practical DRL deployments.

Key facts

  • arXiv paper 2605.14587 investigates plasticity interventions in DRL backdoor attacks.
  • 14,664 cases were studied empirically, spanning combinations of interventions and attack scenarios.
  • Only SAM intervention exacerbates backdoor threats.
  • Other interventions mitigate backdoor threats.
  • Exacerbation is attributed to backdoor gradient amplification.
  • Mitigation stems from activation pathway disruption and representation space compression.
  • Plasticity interventions are built-in components of modern DRL agents.
  • Prior studies focused on vanilla scenarios without plasticity interventions.
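The SAM finding is easier to picture with the intervention's update rule in view. The sketch below is a generic two-step SAM update on a toy quadratic loss—not the paper's code; the loss function, `rho`, and `lr` are illustrative assumptions. SAM first perturbs the weights toward the locally worst-case point within a small ball, then descends using the gradient measured there; the study attributes SAM's exacerbating effect to amplification of backdoor gradients during training under this scheme.

```python
import numpy as np

def loss_grad(w):
    # Gradient of a toy quadratic loss L(w) = 0.5 * ||w||^2 (illustrative stand-in
    # for a DRL policy loss).
    return w

def sam_step(w, lr=0.1, rho=0.05):
    """One Sharpness-Aware Minimization (SAM) update:
    1) ascend to the worst-case perturbation w + e within an L2 ball of radius rho;
    2) descend from w using the gradient measured at the perturbed point."""
    g = loss_grad(w)
    e = rho * g / (np.linalg.norm(g) + 1e-12)  # ascent direction, scaled to the ball
    g_sharp = loss_grad(w + e)                  # gradient at the perturbed weights
    return w - lr * g_sharp

w = np.array([1.0, -2.0])
w_new = sam_step(w)
```

Because the descent direction is taken at the perturbed point rather than at `w` itself, `g_sharp` is slightly larger in magnitude than the plain gradient here—a minimal analogue of how sharpness-seeking updates can scale up particular gradient components.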

Entities

Institutions

  • arXiv

Sources

  • arXiv 2605.14587