ARTFEED — Contemporary Art Intelligence

POLIS Framework Boosts LLM Reasoning via Cumulative Cultural Evolution

ai-technology · 2026-04-24

A new framework called POLIS (Population Orchestrated Learning and Inference Society) enables large language models to improve through interaction-driven cumulative intelligence, mimicking human cumulative cultural evolution. In POLIS, multiple LLM agents generate solutions, verify each other's outputs, and retain validated artifacts in shared cultural memory, which is then internalized via parameter updates. On mathematical reasoning benchmarks, populations of 1–4 billion parameter models achieved average gains of 8.8–18.9 points over base models, narrowing the performance gap to 70B+ parameter monolithic models. Mechanistic ablations identified peer verification as the key ratchet operator, with internalization sustaining accumulation across rounds. The research provides computational evidence that interaction-driven processes can enhance LLM capabilities without relying solely on static corpora or parameter growth.
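The generate → verify → retain → internalize loop described above can be sketched in a few lines of Python. This is an illustrative toy, not the paper's implementation: the agents are stand-in callables that "generate" by evaluating arithmetic strings (with a bias term simulating a weak model), peer verification is independent recomputation, and internalization is abstracted as copying the shared memory into each agent's own state. All class and function names here are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class ToyAgent:
    """Stand-in for an LLM agent; `bias` simulates a weak model's errors."""
    bias: int = 0
    knowledge: list = field(default_factory=list)  # internalized artifacts

    def generate(self, problem, memory):
        # Reuse a previously validated artifact from shared cultural
        # memory if one exists; otherwise attempt the problem directly.
        for prob, ans in memory:
            if prob == problem:
                return (problem, ans)
        return (problem, eval(problem) + self.bias)  # toy "reasoning"

    def verify(self, problem, candidate):
        # Peers verify by independent recomputation (bias-free here,
        # so verification is stronger than generation, as a ratchet needs).
        return candidate[1] == eval(problem)

    def internalize(self, memory):
        # "Parameter update" abstracted as copying validated artifacts
        # into the agent's own state.
        self.knowledge = list(memory)

def polis_round(agents, problem, memory, quorum=2):
    """One round: every agent proposes a solution, peers vote on each
    candidate, and candidates meeting the quorum enter shared memory."""
    candidates = [agent.generate(problem, memory) for agent in agents]
    retained = []
    for cand in candidates:
        votes = sum(peer.verify(problem, cand) for peer in agents)
        if votes >= quorum:  # peer verification as the ratchet operator
            retained.append(cand)
    memory.extend(c for c in retained if c not in memory)  # dedupe
    for agent in agents:
        agent.internalize(memory)  # sustain accumulation across rounds
    return retained
```

For example, with `agents = [ToyAgent(0), ToyAgent(1), ToyAgent(0)]`, the biased agent's wrong answer is rejected by its peers, the correct answer is retained, and after internalization even the biased agent answers correctly in later rounds by drawing on the shared memory.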

Key facts

  • POLIS stands for Population Orchestrated Learning and Inference Society.
  • The framework uses heterogeneous agents that generate, verify, and retain solutions.
  • Gains of 8.8–18.9 points were achieved on math reasoning benchmarks.
  • Models with 1–4B parameters narrowed the gap to 70B+ parameter models.
  • Peer verification is the main ratchet operator in the system.
  • Internalization sustains accumulation across rounds.
  • The research is published on arXiv with ID 2507.21166.
  • The paper was announced in arXiv's replace-cross category (a revised version of a cross-listed submission).

Entities

Institutions

  • arXiv

Sources