ARTFEED — Contemporary Art Intelligence

COSPLAY: Co-Evolving LLM Decision and Skill Bank Agents for Long-Horizon Tasks

ai-technology · 2026-04-25

A new framework called COSPLAY aims to improve how large language models (LLMs) make long-horizon decisions in interactive environments such as games. The system pairs an LLM decision agent with a learnable skill bank that stores reusable skills discovered from the agent's own unlabeled rollouts. The two components co-evolve: as the skill bank grows, the decision agent learns to retrieve skills and select actions more effectively. This targets a known weakness of LLMs, which struggle with multi-step reasoning when rewards are delayed and the environment is only partially observable. The paper is available on arXiv (2604.20987).
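To make the pairing of decision agent and skill bank concrete, here is a minimal Python sketch of what a skill bank with retrieval might look like. It is not the paper's implementation: the names `Skill`, `SkillBank`, `retrieve`, and `record_outcome` are hypothetical, and the word-overlap scoring stands in for whatever retrieval method COSPLAY actually uses.

```python
from dataclasses import dataclass, field


@dataclass
class Skill:
    """A reusable action sequence (hypothetical structure, not from the paper)."""
    name: str
    description: str          # short text describing when the skill applies
    actions: list[str]        # the action sequence the skill expands to
    uses: int = 0
    successes: int = 0

    def success_rate(self) -> float:
        # Laplace smoothing: an unused skill starts at 0.5 rather than 0.
        return (self.successes + 1) / (self.uses + 2)


@dataclass
class SkillBank:
    skills: list[Skill] = field(default_factory=list)

    def add(self, skill: Skill) -> None:
        self.skills.append(skill)

    def retrieve(self, observation: str, k: int = 1) -> list[Skill]:
        """Rank skills by word overlap with the observation, weighted by success rate."""
        obs_tokens = set(observation.lower().split())

        def score(skill: Skill) -> float:
            desc_tokens = set(skill.description.lower().split())
            overlap = len(obs_tokens & desc_tokens) / max(len(desc_tokens), 1)
            return overlap * skill.success_rate()

        return sorted(self.skills, key=score, reverse=True)[:k]

    def record_outcome(self, skill: Skill, success: bool) -> None:
        # Feedback from the environment updates the skill's statistics,
        # which in turn changes future retrieval rankings.
        skill.uses += 1
        skill.successes += int(success)


bank = SkillBank()
bank.add(Skill("open_door", "locked door key", ["pick_key", "unlock", "open"]))
bank.add(Skill("fight", "enemy nearby attack", ["equip_sword", "attack"]))

best = bank.retrieve("a locked door blocks the path")[0]
bank.record_outcome(best, success=True)
```

In this toy version the feedback loop is only a usage counter. In the framework described above, both the decision agent and the skill bank are learned and improve together.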

Key facts

  • COSPLAY is a co-evolution framework for LLM agents in long-horizon tasks.
  • It consists of an LLM decision agent and a learnable skill bank.
  • Skills are discovered from unlabeled agent rollouts.
  • The framework improves skill retrieval and action selection.
  • It targets LLMs' difficulty with making consistent decisions over long horizons.
  • Games serve as testbeds for evaluating skill usage.
  • The paper is on arXiv with ID 2604.20987.
  • The approach handles delayed rewards and partial observability.
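One way to picture skill discovery from unlabeled rollouts is to look for action subsequences that recur across episodes. The Python sketch below uses simple n-gram counting as a stand-in; the source does not describe COSPLAY's actual discovery procedure, and `mine_candidate_skills` and its parameters are hypothetical names chosen for illustration.

```python
from collections import Counter


def mine_candidate_skills(rollouts, length=2, min_count=2):
    """Return action n-grams that appear in at least `min_count` rollouts.

    Illustrative stand-in for skill discovery; no reward labels are used.
    """
    counts = Counter()
    for actions in rollouts:
        # Count each n-gram once per rollout so a single repetitive
        # episode cannot dominate the statistics.
        seen = {
            tuple(actions[i:i + length])
            for i in range(len(actions) - length + 1)
        }
        counts.update(seen)

    frequent = [(ngram, c) for ngram, c in counts.items() if c >= min_count]
    # Sort by frequency (descending), then alphabetically for determinism.
    frequent.sort(key=lambda item: (-item[1], item[0]))
    return [ngram for ngram, _ in frequent]


rollouts = [
    ["move", "pick_key", "unlock", "open"],
    ["pick_key", "unlock", "move"],
    ["move", "attack"],
]

candidates = mine_candidate_skills(rollouts, length=2, min_count=2)
```

Here only the pair `("pick_key", "unlock")` occurs in more than one rollout, so it is the sole candidate. A learned skill bank would go further by deciding which candidates to keep and refine, and when to apply them.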

Entities

Institutions

  • arXiv

Sources