ARTFEED — Contemporary Art Intelligence

CoSPlay: Cooperative Self-Play for Test-Time Code Generation

other · 2026-05-25

CoSPlay, a framework introduced in the arXiv paper 2605.23491, targets the dependence of code generation with large language models (LLMs) on ground-truth unit tests. Existing techniques such as reinforcement learning with verifiable rewards (RLVR) and test-time scaling (TTS) rely on ground-truth tests that are costly to obtain, which the paper identifies as a bottleneck. CoSPlay removes that requirement: it is training-free and uses cooperative self-play to improve the code and the unit tests jointly. It first explores diverse solution ideas and identifies potential failure modes, which it uses to derive unit-test ideas. It then refines both code and tests using bidirectional pass-count signals, reducing the noise and spurious coupling with wrong code that often affect self-generated tests.
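To make "bidirectional pass-count signals" more concrete, here is a minimal Python sketch of the general idea: run every candidate solution against every self-generated test, count passes in both directions (per candidate and per test), and use the per-test counts to filter out tests that too few candidates agree with. This is an illustration under stated assumptions, not the procedure from the paper: the function names, the `min_support` threshold, and the toy absolute-value task are all hypothetical.

```python
from typing import Callable, List, Tuple

Candidate = Callable[[int], int]
TestCase = Tuple[int, int]  # (input, expected output)


def pass_matrix(candidates: List[Candidate], tests: List[TestCase]) -> List[List[bool]]:
    """matrix[i][j] is True when candidate i passes test j."""
    matrix = []
    for cand in candidates:
        row = []
        for arg, expected in tests:
            try:
                row.append(cand(arg) == expected)
            except Exception:
                row.append(False)  # a crash counts as a failed test
        matrix.append(row)
    return matrix


def pass_counts(matrix: List[List[bool]]) -> Tuple[List[int], List[int]]:
    """Return (tests passed per candidate, candidates passing per test)."""
    code_counts = [sum(row) for row in matrix]
    test_counts = [sum(col) for col in zip(*matrix)]
    return code_counts, test_counts


def select_code(candidates, tests, min_support=2):
    """Keep tests passed by at least min_support candidates (an assumed
    heuristic), then pick the candidate passing the most kept tests."""
    matrix = pass_matrix(candidates, tests)
    _, test_counts = pass_counts(matrix)
    kept = [j for j, count in enumerate(test_counts) if count >= min_support]
    scores = [sum(row[j] for j in kept) for row in matrix]
    best = max(range(len(candidates)), key=scores.__getitem__)
    return best, kept


# Toy task (hypothetical): compute the absolute value of an integer.
CANDIDATES = [
    lambda x: abs(x),
    lambda x: x,                    # wrong for negative inputs
    lambda x: -x if x < 0 else x,
]
TESTS = [(3, 3), (-2, 2), (0, 0), (-1, -1)]  # the last test is a noisy one

best, kept = select_code(CANDIDATES, TESTS)
print(best, kept)
```

In this toy setup all three candidates pass three tests when every test is counted, because the wrong candidate happens to satisfy the noisy test. Once the test supported by only one candidate is dropped, the wrong candidate falls behind. A fixed support threshold is only one possible way to use the test-side counts.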

Key facts

  • CoSPlay is a ground-truth-free, training-free framework for LLM code generation
  • It jointly improves code and unit tests through cooperative self-play
  • It explores diverse solution ideas and identifies potential failure modes
  • It uses bidirectional pass-count signals for refinement
  • The paper is arXiv:2605.23491
  • It addresses the bottleneck of ground-truth unit tests in RLVR and TTS methods
  • Self-generated unit tests are often noisy or spuriously coupled with wrong code
  • CoSPlay enables effective test-time scaling without ground-truth tests

Entities

Institutions

  • arXiv

Sources