ARTFEED — Contemporary Art Intelligence

AI Agents Achieve Spontaneous Self-Evolution Without Human Rewards

ai-technology · 2026-04-22

A recent research paper presents a technique for training AI agents to evolve autonomously, so that once deployed they need no external rewards or human oversight. The method builds in an intrinsic meta-evolution capability, enabling agents to learn about new environments on their own before undertaking tasks. During training, an outcome-based reward measures how much an agent’s self-acquired knowledge improves its performance on subsequent tasks, which teaches the model to explore and summarize effectively. During inference, the agent operates without external incentives or human guidance, relying entirely on its internal parameters to navigate unfamiliar settings. Applied to the Qwen3-30B and Seed-OSS-36B models, the approach yielded a 20% improvement in performance, a departure from traditional agent systems that rely on human-defined rewards. The research targets the dependence of modern AI agents on external supervision: such agents typically stop evolving without human intervention.
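The outcome-based reward described above can be illustrated with a minimal sketch. It assumes the reward is the difference in downstream success rate with and without the agent's self-written notes; the summary does not give the exact formula, so that difference, along with the names `success_rate`, `knowledge_reward`, and `toy_solver`, is hypothetical and not taken from the paper.

```python
from typing import Callable, Sequence

# Hypothetical interface: a solver receives a task and the agent's notes
# about the environment, and returns True if the task succeeds.
Solver = Callable[[str, str], bool]


def success_rate(solve: Solver, tasks: Sequence[str], notes: str) -> float:
    """Fraction of tasks solved when the solver is given `notes`."""
    if not tasks:
        raise ValueError("need at least one task")
    return sum(solve(task, notes) for task in tasks) / len(tasks)


def knowledge_reward(solve: Solver, tasks: Sequence[str], notes: str) -> float:
    """Assumed training-time reward: gain in downstream success from the notes.

    Positive when the self-acquired notes help, zero when they add nothing.
    """
    with_notes = success_rate(solve, tasks, notes)
    without_notes = success_rate(solve, tasks, "")
    return with_notes - without_notes


def toy_solver(task: str, notes: str) -> bool:
    """Toy stand-in for an agent: succeeds only if the notes mention the task."""
    return task in notes


tasks = ["open_door", "find_key", "read_map", "cross_river"]
notes = "find_key then open_door"  # summary written after exploring
reward = knowledge_reward(toy_solver, tasks, notes)
print(reward)
```

In this toy setup the notes cover two of the four tasks, so the reward reflects that gain over the no-notes baseline. At inference time, no such reward would be computed; the agent would simply explore, summarize, and act.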

Key facts

  • Research trains AI agents to self-evolve spontaneously, with no external rewards needed once deployed
  • Agents develop intrinsic meta-evolution capability to learn about unseen environments
  • Outcome-based reward mechanism measures improvement in downstream task success
  • Reward signal used only during training phase to teach exploration and summarization
  • At inference time, agents require no external rewards or human instructions
  • Method applied to Qwen3-30B and Seed-OSS-36B models
  • Native evolution approach yields 20% performance improvement
  • Current agent systems typically depend on human-defined rewards and rules

Entities

Institutions

  • arXiv
