Double Deep Reinforcement Learning for Demand Forecasting Model Selection

ai-technology · 2026-05-07

A new research paper proposes a double deep reinforcement learning agent that automates the selection of forecasting models for supply chain demand prediction. At prediction time, the agent chooses one model from a forecasting committee, which addresses the difficulty of picking an appropriate method for datasets with distinct features. The authors also introduce a novel early-stopping approach, based on average reward convergence, to reduce training time. The model was evaluated on grocery sales and snack demand datasets. The study builds on research into automatic forecasting model selection that dates back to the 1980s and draws on recent advances in demand forecasting.
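The selection step described above can be illustrated with a minimal sketch. It assumes the "double deep" agent follows the standard Double DQN update, in which the online network picks the next action and the target network scores it; the summary does not confirm the paper's exact formulation. The function names, the committee labels, and the Q-values are hypothetical, and plain lists stand in for network outputs.

```python
def argmax(values):
    """Index of the largest entry in a list of numbers."""
    return max(range(len(values)), key=values.__getitem__)


def select_model(q_values, committee):
    """Pick the committee member with the highest estimated value.

    Each action of the agent corresponds to one forecasting model.
    """
    return committee[argmax(q_values)]


def double_dqn_target(reward, next_q_online, next_q_target, gamma=0.99, done=False):
    """Standard Double DQN target for one transition.

    The online network chooses the next action, the target network
    evaluates it, which reduces the overestimation of plain Q-learning.
    """
    if done:
        return reward
    best_next_action = argmax(next_q_online)
    return reward + gamma * next_q_target[best_next_action]


# Hypothetical committee and Q-values, for illustration only.
committee = ["arima", "exponential_smoothing", "gradient_boosting"]
chosen = select_model([0.1, 0.7, 0.2], committee)
target = double_dqn_target(1.0, [0.2, 0.9, 0.1], [0.5, 0.3, 0.8], gamma=0.5)
```

In this sketch the reward would typically reflect forecast accuracy of the chosen model, although the summary does not state how the paper defines it.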

Key facts

The research proposes a double deep reinforcement learning agent for automatic forecasting model selection.
The agent selects a model from a forecasting committee at prediction time.
A novel early-stopping approach based on average reward convergence expedites training.
Empirical evaluation used grocery sales and snack demand datasets.
The work addresses the challenge of selecting appropriate forecasting solutions for datasets with distinct features.
Research on automatic forecasting model selection has been ongoing since the 1980s.
Recent developments in demand forecasting have opened new perspectives on automatic model selection.
The study is published on arXiv with ID 2605.04068.
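The early-stopping idea listed among the key facts can be sketched as follows. This is one possible reading of "average reward convergence", not the paper's criterion: training stops once the mean reward over a sliding window changes by less than a tolerance for several consecutive episodes. The class name and the `window`, `tolerance`, and `patience` parameters are assumptions introduced for illustration.

```python
from collections import deque


class AverageRewardEarlyStopper:
    """Hypothetical helper: stop when the windowed mean reward stops changing."""

    def __init__(self, window=50, tolerance=1e-3, patience=5):
        self.rewards = deque(maxlen=window)  # most recent episode rewards
        self.tolerance = tolerance           # largest change still counted as "stable"
        self.patience = patience             # stable checks required before stopping
        self.previous_mean = None
        self.stable_checks = 0

    def update(self, episode_reward):
        """Record one episode reward; return True when training should stop."""
        self.rewards.append(episode_reward)
        if len(self.rewards) < self.rewards.maxlen:
            return False  # not enough history to compute a full-window mean
        mean = sum(self.rewards) / len(self.rewards)
        if self.previous_mean is not None and abs(mean - self.previous_mean) < self.tolerance:
            self.stable_checks += 1
        else:
            self.stable_checks = 0  # the average moved, so restart the count
        self.previous_mean = mean
        return self.stable_checks >= self.patience


# Usage: call update() once per training episode and break when it returns True.
stopper = AverageRewardEarlyStopper(window=3, tolerance=1e-6, patience=2)
flat_run = [stopper.update(1.0) for _ in range(5)]
```

A larger `window` smooths out noisy rewards at the cost of a later stop, so the three parameters would need tuning per dataset.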
