Training Language Agents to Learn from Experience

ai-technology · 2026-05-22

A recent study presents In-context Training (ICT), a task for evaluating cross-task self-improvement in language agents. The researchers propose a reinforcement learning (RL) training pipeline in which a reflector model observes an actor model's trajectories and writes system prompts that improve the actor's performance on unfamiliar tasks, without requiring human examples. In experiments on ALFWorld and MiniHack, trained reflectors outperformed untrained baselines on most held-out task families, suggesting that the ability to learn from experience can itself be acquired through training.

Key facts

The paper introduces the In-context Training (ICT) task.
ICT evaluates cross-task self-improvement in language agents.
A reflector model observes trajectories from an actor model.
The reflector generates system prompts to improve future performance.
An RL-based training pipeline is used without human examples.
Tests were conducted on ALFWorld and MiniHack environments.
Trained reflectors outperformed untrained baselines on most held-out task families.
The paper is available on arXiv with ID 2605.20477.
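
The actor and reflector loop described above can be pictured with a small sketch. This is an illustration under stated assumptions, not the paper's implementation: the names Trajectory, success_rate, reflector_reward, toy_actor and toy_reflector are hypothetical, the actor and reflector are stand-in functions rather than language models, and the reward (change in actor success rate on held-out tasks) is an assumed choice that the summary does not confirm.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Trajectory:
    """One actor episode: the task, the actions taken, and the outcome."""
    task: str
    steps: List[str]
    success: bool


# Hypothetical interfaces: both roles are plain callables in this sketch.
Actor = Callable[[str, str], Trajectory]       # (system_prompt, task) -> Trajectory
Reflector = Callable[[List[Trajectory]], str]  # observed trajectories -> system prompt


def success_rate(actor: Actor, system_prompt: str, tasks: List[str]) -> float:
    """Fraction of tasks the actor solves under the given system prompt."""
    results = [actor(system_prompt, task) for task in tasks]
    return sum(r.success for r in results) / len(results)


def reflector_reward(actor: Actor, reflector: Reflector,
                     observed_tasks: List[str], held_out_tasks: List[str],
                     base_prompt: str = ""):
    """Return the reflector's prompt and an assumed scalar reward for it."""
    # 1. The actor attempts the observed tasks with the base prompt.
    trajectories = [actor(base_prompt, task) for task in observed_tasks]
    # 2. The reflector reads those trajectories and writes a new system prompt.
    new_prompt = reflector(trajectories)
    # 3. Assumed reward: improvement in success rate on unseen tasks.
    before = success_rate(actor, base_prompt, held_out_tasks)
    after = success_rate(actor, new_prompt, held_out_tasks)
    return new_prompt, after - before


def toy_actor(system_prompt: str, task: str) -> Trajectory:
    """Stand-in actor: succeeds only if the prompt contains a specific tip."""
    knows_tip = "look inside containers" in system_prompt
    steps = ["open cabinet", "take item"] if knows_tip else ["wander"]
    return Trajectory(task=task, steps=steps, success=knows_tip)


def toy_reflector(trajectories: List[Trajectory]) -> str:
    """Stand-in reflector: emits a fixed tip if any observed episode failed."""
    if any(not t.success for t in trajectories):
        return "Tip: look inside containers before giving up."
    return ""


prompt, reward = reflector_reward(
    toy_actor, toy_reflector,
    observed_tasks=["find mug"],
    held_out_tasks=["find key", "find book"],
)
print(prompt, reward)
```

In an RL setup of this shape, the scalar returned by reflector_reward would be the signal used to update the reflector, which is how such a loop could avoid needing human-written example prompts.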
