ARTFEED — Contemporary Art Intelligence

Differential Privacy Guarantees for RL with General Function Approximation

other · 2026-05-11

A new theoretical framework establishes the first differential privacy guarantees for online reinforcement learning with general function approximation, moving beyond earlier tabular and linear settings. The approach combines a batched policy-update scheme with the exponential mechanism and a novel regret analysis, achieving regret of Õ(K^{3/5}) in the model-free setting, which matches the state-of-the-art bounds for the linear case. The work also gives the first regret bound for online RL with batch updates that depends on the coverability complexity measure, complementing results based on the Eluder-Condition class. In addition, the authors identify notable gaps in recent results on private RL with linear function approximation.
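The exponential mechanism mentioned above is a standard differential-privacy primitive (due to McSherry and Talwar): it selects an item with probability exponentially weighted by a utility score, so no single data point changes the selection distribution much. The sketch below is illustrative only and is not the paper's algorithm; the candidate set, utility function, and parameters are placeholder assumptions standing in for the candidate policies scored between batches.

```python
import math
import random

def exponential_mechanism(candidates, utility, epsilon, sensitivity=1.0):
    """Sample one candidate with probability proportional to
    exp(epsilon * utility(c) / (2 * sensitivity)).

    Generic sketch of the exponential mechanism; in the paper's setting
    the candidates would be policies and the utility a data-dependent
    score, but those details are assumptions here, not from the source.
    """
    scores = [epsilon * utility(c) / (2.0 * sensitivity) for c in candidates]
    m = max(scores)  # subtract the max before exponentiating, for stability
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    r = random.random() * total
    acc = 0.0
    for c, w in zip(candidates, weights):
        acc += w
        if r <= acc:
            return c
    return candidates[-1]  # guard against floating-point rounding
```

For large epsilon the mechanism concentrates on the highest-utility candidate; for small epsilon it approaches uniform sampling, which is the privacy–utility trade-off the regret analysis has to account for.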

Key facts

  • First theoretical guarantees for differentially private online RL with general function approximation
  • Combines batched policy update scheme with exponential mechanism
  • Regret scales as Õ(K^{3/5}) in model-free setting under differential privacy
  • Matches state of the art for linear case
  • First regret bound for online RL with batched updates that depends on the coverability complexity measure
  • Uncovers gaps in recent results for private RL with linear function approximation
  • Extends beyond tabular and linear settings
  • Published on arXiv with ID 2605.07049
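To see why the Õ(K^{3/5}) bound matters: any regret bound sublinear in the number of episodes K means the average per-episode regret vanishes as K grows. The snippet below is a purely numerical illustration of that scaling (log factors ignored); it is not derived from the paper's analysis.

```python
def regret_bound(K, power=3 / 5):
    """Illustrative K^{3/5} regret curve (constants and log factors ignored)."""
    return K ** power

# Per-episode regret shrinks as K grows, unlike a linear-in-K bound.
for K in [10**3, 10**5, 10**7]:
    print(K, regret_bound(K) / K)
```

By contrast, a trivial bound linear in K would leave per-episode regret constant, so the learner would never demonstrably improve.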

Entities

Institutions

  • arXiv

Sources