ARTFEED — Contemporary Art Intelligence

Qreg+NWLU: Novel Data Rehearsal Method for Continual Reinforcement Learning

other · 2026-05-23

A recent preprint posted on arXiv (2605.22454) introduces Qreg+NWLU, a technique for reducing catastrophic forgetting in Continual Reinforcement Learning (CRL) through value-based data rehearsal. Existing CRL methods typically focus on policy-gradient approaches and regularize only the actor, overlooking the value function approximation. The researchers address this gap by applying data rehearsal to Deep Q-Networks, using Q-value regularization in environments where task sequences recur. Qreg+NWLU has two key components: a continuous data rehearsal process that keeps gathering and refreshing stored Q-values during training, and 'No-Wait' regularization, which takes effect immediately rather than only after the initial task. The study notes that multi-cyclic environments intensify the problems of both forgetting and plasticity, a significant yet underexamined real-world challenge.
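To make the idea of Q-value regularization with rehearsal concrete, here is a minimal Python sketch. It is not the paper's implementation: the class and function names (`QRehearsalBuffer`, `regularized_loss`), the random-overwrite refresh policy, the squared-error penalty, and the `reg_weight` parameter are all assumptions chosen for illustration. The sketch stores (state, action, Q-value) triples, keeps overwriting them as training proceeds, and adds a penalty for drifting from the stored values. The penalty applies as soon as the buffer holds anything, which is one plausible reading of 'No-Wait'.

```python
import random


class QRehearsalBuffer:
    """Hypothetical store of (state, action, q_value) triples, refreshed during training."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.items = []
        self.rng = random.Random(seed)

    def store(self, state, action, q_value):
        # Assumed refresh policy: once full, overwrite a random slot so that
        # stored Q-values keep being updated throughout training.
        item = (state, action, q_value)
        if len(self.items) < self.capacity:
            self.items.append(item)
        else:
            self.items[self.rng.randrange(self.capacity)] = item

    def sample(self, k):
        # Draw up to k stored triples without replacement.
        return self.rng.sample(self.items, min(k, len(self.items)))


def regularized_loss(td_errors, q_fn, rehearsed, reg_weight):
    """Mean squared TD error plus a penalty for drifting from stored Q-values.

    q_fn(state, action) returns the current network's Q-value estimate.
    There is no check on the task index, so the penalty is active from the
    first task onward (assumed interpretation of 'No-Wait').
    """
    td_loss = sum(e * e for e in td_errors) / len(td_errors)
    if not rehearsed:
        return td_loss
    drift = sum((q_fn(s, a) - q_old) ** 2 for s, a, q_old in rehearsed) / len(rehearsed)
    return td_loss + reg_weight * drift
```

In a real Deep Q-Network, `q_fn` would be the network's forward pass and the loss would be computed on tensors so gradients can flow; plain floats are used here only to keep the sketch self-contained.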

Key facts

  • Paper title: Don't Forget the Critic: Value-Based Data Rehearsal for Multi-Cyclic Continual Reinforcement Learning
  • arXiv ID: 2605.22454
  • Announce type: cross
  • Proposes Qreg+NWLU method
  • Addresses catastrophic forgetting in CRL
  • Focuses on value function approximation via data rehearsal
  • Uses Deep Q-Networks with Q-value regularization
  • Introduces continuous data rehearsal and No-Wait regularization
  • Targets multi-cyclic environments with repeating task sequences
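
A multi-cyclic setting with repeating task sequences can be pictured with a small Python sketch. The function name `multi_cyclic_schedule` and the task labels are hypothetical and do not come from the paper; the point is only that the agent revisits every task once per cycle, so earlier tasks can be forgotten and relearned several times.

```python
def multi_cyclic_schedule(tasks, num_cycles):
    """Yield (cycle_index, task) pairs, repeating the whole task sequence num_cycles times."""
    for cycle_index in range(num_cycles):
        for task in tasks:
            yield cycle_index, task


# Hypothetical task labels, used only for illustration.
for cycle_index, task in multi_cyclic_schedule(["task_A", "task_B", "task_C"], 2):
    print(f"cycle {cycle_index}: train on {task}")
```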

Entities

Institutions

  • arXiv
