Dual-Window Smoothing for Implicit Action Chunking in Continuous Control

other · 2026-05-20

Dual-Window Smoothing (DWS) is a reinforcement learning framework that targets high-frequency oscillatory control signals, which threaten safety and stability in real-world applications. Explicit action chunking methods predict fixed-horizon trajectories, so the policy output dimension grows with horizon length; this leads to optimization challenges and step-wise interaction issues. DWS instead maintains temporal coherence without enlarging the action space. It uses two windows: the execution window ensures physical smoothness via deterministic modulation, while the value window aligns temporal-difference targets across the horizon to mitigate the critic bias caused by open-loop execution. DWS also incorporates a lightweight temporal mechanism on the actor side. The paper is available on arXiv under ID 2605.19592.
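To make the contrast concrete, here is a minimal Python sketch. It first shows how the output size of an explicit chunking policy grows with the horizon, then shows one possible deterministic smoother that leaves the policy output at a single action. The summary does not specify the modulation rule used by DWS, so the moving average below, along with the names `chunked_output_dim` and `ExecutionWindow`, is a hypothetical stand-in and not the paper's method.

```python
from collections import deque


def chunked_output_dim(action_dim, horizon):
    """Output size of an explicit chunking policy: one action per horizon step."""
    return action_dim * horizon


class ExecutionWindow:
    """Hypothetical smoother (an assumption, not the paper's rule).

    Returns the mean of the most recent `size` raw actions, so the policy
    still emits one action per step and the action space is unchanged.
    """

    def __init__(self, size):
        if size < 1:
            raise ValueError("size must be at least 1")
        self.buffer = deque(maxlen=size)  # oldest action is dropped automatically

    def smooth(self, raw_action):
        self.buffer.append(list(raw_action))
        count = len(self.buffer)
        # Average each action dimension over the actions currently in the window.
        return [
            sum(action[i] for action in self.buffer) / count
            for i in range(len(raw_action))
        ]


# A 6-dimensional action with an 8-step chunk needs a 48-dimensional output.
print(chunked_output_dim(6, 8))  # 48

# An oscillating 1-D signal is damped by a window of size 2.
window = ExecutionWindow(size=2)
print([window.smooth([a]) for a in (1.0, -1.0, 1.0, -1.0)])
# [[1.0], [0.0], [0.0], [0.0]]
```

The smoother is deterministic and sits outside the policy, which is one way to obtain smoothness without widening the policy head; the actual mechanism in the paper may differ.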

Key facts

Dual-Window Smoothing (DWS) is proposed for smooth continuous control in reinforcement learning.
Explicit action chunking predicts fixed-horizon trajectories but scales policy output dimension with horizon length.
DWS uses an execution window for physical smoothness via deterministic modulation.
DWS uses a value window to align temporal-difference targets over the horizon.
DWS corrects critic bias caused by open-loop execution.
DWS includes a lightweight actor-side temporal mechanism.
The paper is available on arXiv with ID 2605.19592.
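As a rough illustration of what aligning temporal-difference targets over a horizon can look like, the sketch below computes a standard n-step TD target: discounted rewards collected over the window plus a discounted bootstrap value at its end. This is a textbook construction used here as an assumption; the summary does not state how the DWS value window builds its targets, and the function name `n_step_td_target` is hypothetical.

```python
def n_step_td_target(rewards, bootstrap_value, gamma):
    """Standard n-step TD target (illustrative stand-in, not the DWS rule).

    rewards: rewards observed over the window, in time order.
    bootstrap_value: critic estimate for the state reached after the window.
    gamma: discount factor.
    """
    target = bootstrap_value
    # Fold backwards so each reward is discounted by its distance from the start.
    for reward in reversed(rewards):
        target = reward + gamma * target
    return target


# Three steps of reward 1.0, bootstrap value 10.0, gamma 0.5:
# 1 + 0.5 * (1 + 0.5 * (1 + 0.5 * 10)) = 3.0
print(n_step_td_target([1.0, 1.0, 1.0], 10.0, 0.5))  # 3.0
```

Using one target that spans the whole window, rather than a one-step target, keeps the critic's update consistent with actions that were executed without re-querying the policy at every step.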
