SkillOS: Reinforcement Learning for Self-Evolving AI Agents

ai-technology · 2026-05-09

Researchers propose SkillOS, a reinforcement learning (RL) framework that lets LLM-based agents autonomously curate reusable skills from past interactions. Current agents fail to learn from experience, and existing skill curation relies on manual work or heuristic operations. SkillOS pairs a frozen agent executor with a trainable skill curator that updates an external SkillRepo, using composite rewards derived from delayed feedback as the learning signal. The approach targets the learning of complex, long-term curation policies, described as a key bottleneck in self-evolving agents. The paper is available on arXiv under ID 2605.06614.

Key facts

SkillOS is an RL training recipe for learning skill curation in self-evolving agents.
It pairs a frozen agent executor with a trainable skill curator.
The curator updates an external SkillRepo from accumulated experience.
Composite rewards provide learning signals for curation.
Existing approaches rely on manual curation or heuristic operations.
SkillOS targets the learning of complex, long-term curation policies from indirect, delayed feedback.
The paper is published on arXiv with ID 2605.06614.
LLM-based agents currently fail to learn from past interactions.
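
The division of labor described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the `SkillRepo` operations, the stand-in `frozen_executor`, the placeholder `curate` policy, and the reward weights `w_task` and `w_size` are all hypothetical names and choices introduced only for illustration. The sketch shows a curator editing an external repository, a fixed executor using it on later tasks, and a composite reward computed from that delayed outcome.

```python
from dataclasses import dataclass, field


@dataclass
class SkillRepo:
    """External skill store edited by the curator (structure is assumed)."""
    skills: dict[str, str] = field(default_factory=dict)

    def apply(self, op: str, name: str, body: str = "") -> None:
        # Hypothetical edit operations; the paper's action space may differ.
        if op in ("add", "update"):
            self.skills[name] = body
        elif op == "delete":
            self.skills.pop(name, None)


def frozen_executor(task: str, repo: SkillRepo) -> bool:
    """Stand-in for the frozen agent: succeeds only if a matching skill exists."""
    return task in repo.skills


def curate(task: str, repo: SkillRepo) -> None:
    """Placeholder curator; in SkillOS this role is played by a trainable model."""
    repo.apply("add", task, f"steps learned from {task}")


def composite_reward(later_successes: list[bool], repo_size: int,
                     w_task: float = 1.0, w_size: float = 0.01) -> float:
    """Assumed composite: delayed task success minus a small repo-size penalty."""
    success_rate = sum(later_successes) / len(later_successes)
    return w_task * success_rate - w_size * repo_size


# One curation step, followed by later tasks that supply the delayed feedback.
repo = SkillRepo()
curate("book_flight", repo)
later_tasks = ["book_flight", "book_flight", "file_taxes"]
outcomes = [frozen_executor(t, repo) for t in later_tasks]
reward = composite_reward(outcomes, len(repo.skills))
```

In an actual RL setup, `reward` would be used to update the curator's parameters while the executor stays fixed; the update rule is omitted here because the summary does not specify it.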
