Federated Actor-Critic Achieves Personalized Policy Training
A recent study posted to arXiv introduces a federated actor-critic framework for collaborative yet personalized policy training. Agents share a common linear subspace representation while keeping personalized local policy components, letting them pool experience across heterogeneous environments without sacrificing per-agent behavior. The authors establish finite-time convergence under single-timescale updates with Markovian sampling: the critic error decays at rate Õ(1/((1-γ)^4√(TK))) and the policy gradient norm at Õ(1/((1-γ)^6√(TK))), where γ is the discount factor. This addresses environmental heterogeneity and personalization, aspects largely neglected by prior federated reinforcement learning analyses.
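The shared-subspace idea can be illustrated with a minimal sketch. This is not the authors' exact algorithm — the toy chain environments, dimensions, step sizes, and averaging period are all illustrative assumptions. Each agent runs single-timescale actor-critic on its own Markov chain (Markovian sampling, no i.i.d. resets), adapts a shared projection `W` alongside a personalized critic head and policy, and a server periodically averages only the shared component:

```python
import numpy as np

rng = np.random.default_rng(0)

N_AGENTS, N_STATES, D, K = 4, 6, 8, 3   # illustrative sizes
GAMMA = 0.9

# Raw state features x(s) in R^D, common to all agents.
X = rng.normal(size=(N_STATES, D))

# Shared linear subspace W in R^{D x K}; each agent holds a local copy.
W = [rng.normal(scale=0.1, size=(D, K)) for _ in range(N_AGENTS)]
# Personalized components: critic head w_i and policy parameters theta_i.
w = [np.zeros(K) for _ in range(N_AGENTS)]
theta = [np.zeros((N_STATES, 2)) for _ in range(N_AGENTS)]

# Environmental heterogeneity: each agent's chain drifts right with its own p_i.
p_right = np.linspace(0.55, 0.8, N_AGENTS)

def step(s, a, p):
    """One transition of a rewarded random walk (reward at the right end)."""
    move = 1 if rng.random() < (p if a == 1 else 1 - p) else -1
    s2 = min(max(s + move, 0), N_STATES - 1)
    return s2, (1.0 if s2 == N_STATES - 1 else 0.0)

def policy(th, s):
    """Softmax policy over two actions."""
    z = np.exp(th[s] - th[s].max())
    return z / z.sum()

ALPHA = BETA = 0.01          # single timescale: same-order step sizes
states = [0] * N_AGENTS      # persistent states -> Markovian sampling

for t in range(2000):
    for i in range(N_AGENTS):
        s = states[i]
        pi = policy(theta[i], s)
        a = rng.choice(2, p=pi)
        s2, r = step(s, a, p_right[i])
        phi, phi2 = X[s] @ W[i], X[s2] @ W[i]          # shared-subspace features
        delta = r + GAMMA * phi2 @ w[i] - phi @ w[i]   # TD error
        w[i] += ALPHA * delta * phi                    # personalized critic head
        W[i] += ALPHA * delta * np.outer(X[s], w[i])   # adapt shared subspace
        grad_log = -pi
        grad_log[a] += 1.0
        theta[i][s] += BETA * delta * grad_log         # personalized actor step
        states[i] = s2
    if t % 50 == 0:                                    # federate only W
        W_avg = sum(W) / N_AGENTS
        W = [W_avg.copy() for _ in range(N_AGENTS)]
```

Only `W` crosses the network in the averaging step; the critic heads `w_i` and policies `theta_i` never leave their agents, which is the sense in which training is collaborative yet personalized.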
Key facts
- arXiv:2605.14423v1
- Federated actor-critic framework
- Agents share common linear subspace representation
- Personalized local policy components
- Single-timescale updates with Markovian sampling
- Critic error convergence rate: Õ(1/((1-γ)^4√(TK)))
- Policy gradient norm convergence rate: Õ(1/((1-γ)^6√(TK)))