Goal-Space Planning Improves RL for Demand Response Scheduling

other · 2026-05-16

A team of researchers has combined Goal-Space Planning (GSP) with Deep Deterministic Policy Gradient (DDPG) to handle terminal constraints in data-driven demand response scheduling for electrified chemical processes. The method uses learned temporally abstract models over discrete subgoals to propagate value across long horizons, addressing the credit-assignment difficulties that standard reinforcement learning faces. On a simulated air separation benchmark, it improved sample efficiency over standard DDPG, satisfied terminal storage constraints, and mitigated myopic control behavior. The paper is available on arXiv (2605.14741) in the Electrical Engineering and Systems Science > Systems and Control category.
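To make the idea of value propagation over discrete subgoals more concrete, here is a minimal sketch of how planning in a small subgoal graph could work. It is not the paper's implementation: the three subgoals (storage levels), the reward and discount tables, the terminal value, and the function names subgoal_value_iteration and project_to_state are all hypothetical, and in the actual method the subgoal models would be learned from data rather than written by hand.

```python
import numpy as np

def subgoal_value_iteration(r_sub, gamma_sub, reachable, terminal, n_iters=50):
    """Value iteration over a graph of discrete subgoals.

    r_sub[i, j]     : assumed discounted reward collected going from subgoal i to j
    gamma_sub[i, j] : assumed discount accumulated going from subgoal i to j
    reachable[i, j] : True if subgoal j can be reached from subgoal i
    terminal        : dict {subgoal index: fixed terminal value}
    """
    n = r_sub.shape[0]
    v = np.zeros(n)
    for g, value in terminal.items():
        v[g] = value
    for _ in range(n_iters):
        # Back up through every reachable subgoal-to-subgoal jump.
        q = np.where(reachable, r_sub + gamma_sub * v[None, :], -np.inf)
        v = q.max(axis=1)
        # Terminal subgoals keep their fixed value.
        for g, value in terminal.items():
            v[g] = value
    return v

def project_to_state(r_to_sub, gamma_to_sub, v_sub):
    """Estimate a state's value by looking ahead to the best subgoal."""
    return float(np.max(r_to_sub + gamma_to_sub * v_sub))

# Toy chain of storage levels (all numbers invented): low -> mid -> target.
# Reaching the target level by the end of the horizon is worth +10.
r_sub = np.array([[0.0, -1.0, 0.0],
                  [0.0, 0.0, -1.0],
                  [0.0, 0.0, 0.0]])
gamma_sub = np.array([[0.0, 0.9, 0.0],
                      [0.0, 0.0, 0.9],
                      [0.0, 0.0, 0.0]])
reachable = gamma_sub > 0
v_sub = subgoal_value_iteration(r_sub, gamma_sub, reachable, terminal={2: 10.0})
# v_sub is approximately [6.2, 8.0, 10.0]

# A state close to the "low" subgoal, with assumed state-to-subgoal models.
v_state = project_to_state(np.array([-0.5, -2.0, -5.0]),
                           np.array([0.95, 0.8, 0.5]),
                           v_sub)
```

In this toy example the terminal value reaches the "low" subgoal after two abstract backups, whereas a one-step bootstrapped critic would need the signal to travel through every intermediate time step. A subgoal-derived estimate such as v_state could then inform the DDPG critic's learning target; how exactly the paper combines the two is not stated in this summary.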

Key facts

Goal-Space Planning (GSP) integrated with Deep Deterministic Policy Gradient (DDPG)
Addresses terminal constraints in demand response scheduling
Uses learned temporally abstract models over discrete subgoals
Applied to simulated air separation benchmark
Improves sample efficiency over standard DDPG
Satisfies terminal storage constraints
Mitigates myopic control behavior
Published on arXiv (2605.14741)
