ARTFEED — Contemporary Art Intelligence

Unifying Goal-Conditioned RL and Unsupervised Skill Learning via Control-Maximization

publication · 2026-05-09

A new theoretical paper on arXiv (2605.06145) addresses the gap between goal-conditioned reinforcement learning (GCRL) and mutual-information skill learning (MISL). The authors identify three canonical GCRL formulations and prove they are fundamentally inequivalent: in the same environment, they can induce mutually incompatible optimal policies. They propose a unified framework, called control maximization, that treats both GCRL and MISL as instances of a single principle. The framework aims to explain why skills learned through unsupervised MISL can support downstream goal reaching, a phenomenon that previously lacked theoretical grounding. The work is purely theoretical and includes no experimental validation.
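For readers unfamiliar with the MISL side of the unification, here is a minimal sketch of the usual mutual-information skill objective. This is standard DIAYN-style background, not the paper's own construction, and the discriminator values are made up for illustration: a learned discriminator q(z | s) yields an intrinsic reward whose expectation is a variational lower bound on the mutual information I(S; Z) between states and skills.

```python
import numpy as np

# Hedged background sketch (DIAYN-style, not from the paper itself):
# intrinsic reward r(s, z) = log q(z | s) - log p(z), where q is a
# learned skill discriminator and p(z) is the skill prior.

n_skills = 4
p_z = np.full(n_skills, 1.0 / n_skills)  # uniform prior over skills

def intrinsic_reward(q_z_given_s, z):
    """Reward for reaching a state s while executing skill z."""
    return np.log(q_z_given_s[z]) - np.log(p_z[z])

# Hypothetical discriminator output for a state that clearly reveals skill 2:
q = np.array([0.05, 0.05, 0.85, 0.05])
print(round(intrinsic_reward(q, 2), 3))  # → 1.224 (state identifies the skill)
print(round(intrinsic_reward(q, 0), 3))  # → -1.609 (state contradicts the skill)
```

Per the abstract, control maximization recasts both this objective and goal reaching as instances of one principle.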

Key facts

  • arXiv paper 2605.06145
  • Unifies GCRL and MISL under control maximization
  • Identifies three canonical GCRL formulations
  • Proves formulations are fundamentally inequivalent
  • No experimental results included
  • Addresses theoretical foundations of unsupervised pretraining in GCRL
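For contrast with the skill-learning objective, the simplest goal-conditioned setup is sparse goal reaching. This is generic GCRL background; the paper's three canonical formulations are not spelled out in this summary, and the tolerance value below is an illustrative assumption:

```python
import numpy as np

# Hedged background sketch: a sparse goal-reaching reward, the most common
# starting point for GCRL (generic illustration, not the paper's formalism).

def sparse_goal_reward(state, goal, tol=0.5):
    """Reward 1.0 when the state is within tol of the goal, else 0.0."""
    return float(np.linalg.norm(np.asarray(state) - np.asarray(goal)) <= tol)

print(sparse_goal_reward([0.0, 0.0], [0.3, 0.2]))  # → 1.0 (within tolerance)
print(sparse_goal_reward([0.0, 0.0], [2.0, 2.0]))  # → 0.0
```

The open question the paper targets is why policies trained only on the mutual-information objective transfer to rewards of this goal-reaching form.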

Entities

Institutions

  • arXiv

Sources