ARTFEED — Contemporary Art Intelligence

Activation Functions Key to Sustaining Plasticity in Continual Learning

other · 2026-05-01

A recent study published on arXiv (2509.22562) examines how activation functions affect plasticity loss in continual learning, a challenge distinct from catastrophic forgetting: rather than overwriting old tasks, models gradually lose the ability to learn new ones. The findings indicate that the choice of activation function is a key, architecture-independent lever for alleviating this issue. The authors analyze how the shape of the negative branch and saturation behavior influence plasticity, and propose two drop-in nonlinearities: Smooth-Leaky and Randomized Smooth-Leaky. Both are evaluated on supervised class-incremental benchmarks and on reinforcement learning in dynamic MuJoCo environments. The study stresses that while differences between activation functions largely vanish with tuning in i.i.d. settings, they remain significant and largely overlooked in continual learning scenarios.
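The paper's exact definitions are not reproduced here, but the idea of a smooth nonlinearity with a controlled leaky negative branch can be sketched as follows. This is an illustrative assumption, not the authors' formulation: the leak slope `alpha`, the softplus blend, and the sampling range in the randomized variant are all placeholders chosen to show the general shape.

```python
import numpy as np

def softplus(x):
    # Numerically stable log(1 + exp(x)).
    return np.logaddexp(0.0, x)

def smooth_leaky(x, alpha=0.1):
    """Hypothetical smooth-leaky nonlinearity (illustration only).

    Blends a linear leak with softplus: the curve is smooth
    everywhere (no kink at zero), approaches slope `alpha` for
    large negative inputs, and slope 1 for large positive inputs,
    so the negative branch never fully saturates.
    """
    return alpha * x + (1.0 - alpha) * softplus(x)

def randomized_smooth_leaky(x, alpha_range=(0.05, 0.3), rng=None):
    """Randomized variant: sample the leak slope per call, a
    plausible way to inject stochasticity during training."""
    rng = np.random.default_rng() if rng is None else rng
    alpha = rng.uniform(*alpha_range)
    return smooth_leaky(x, alpha)
```

Because the function is linear-in-the-limit on both sides, its gradient stays bounded away from zero on the negative branch, which is the kind of property the paper's saturation analysis concerns.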

Key facts

  • Study from arXiv:2509.22562
  • Focuses on plasticity loss in continual learning
  • Activation choice is an architecture-agnostic lever
  • Introduces Smooth-Leaky and Randomized Smooth-Leaky
  • Evaluated in supervised class-incremental and RL MuJoCo
  • Contrasts with i.i.d. regimes where differences shrink
  • Addresses loss of adaptability beyond catastrophic forgetting
  • Based on negative-branch shape and saturation analysis

Entities

Institutions

  • arXiv

Sources