Model Collapse Explained Through Cultural Evolution Theory

ai-technology · 2026-05-25

A recent study posted on arXiv (2605.23054) applies iterated learning theory from cultural evolution to explain model collapse in large language models (LLMs). The researchers derived five falsifiable predictions and tested them by self-training LLaMA-2-7B and Mistral-7B for ten generations in English, German, and Turkish. The key finding is that compositionality follows a non-monotonic trajectory under unfiltered self-training: it first rises, then falls. The pattern persists even when the seed data is maximally regular, which rules out noise removal as the explanation. Task-grounded filtering sustains the signature, while random filtering does not, which the study presents as the first LLM-scale evidence of the compression-communication tradeoff. All five predictions were confirmed.

Key facts

Study applies iterated learning theory from cultural evolution to model collapse in LLMs.
Five falsifiable predictions were derived and tested.
Models tested: LLaMA-2-7B and Mistral-7B over 10 generations.
Languages: English, German, Turkish.
Compositionality follows a non-monotonic trajectory under unfiltered self-training.
Non-monotonic signature persists with maximally regular seed data.
Task-grounded filtering sustains the signature; random filtering does not.
First LLM-scale evidence for the compression-communication tradeoff.
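
The self-training setup summarized above (train on the previous generation's output, optionally filter, repeat for ten generations) can be sketched as a small loop. This is a minimal illustration under stated assumptions, not the study's code: `iterate_self_training`, `train`, `generate`, and the three filter helpers are hypothetical names, and the toy learner simply resamples its training data rather than fine-tuning an LLM. It does not measure compositionality; it only shows where unfiltered, random, and task-grounded filtering would plug into the loop.

```python
import random
from typing import Callable, List

Sample = str
Selector = Callable[[List[Sample], random.Random], List[Sample]]


def no_filter(samples: List[Sample], rng: random.Random) -> List[Sample]:
    """Unfiltered self-training: keep every generated sample."""
    return list(samples)


def random_filter(keep_fraction: float) -> Selector:
    """Keep a random subset of the samples, ignoring their quality."""
    def select(samples: List[Sample], rng: random.Random) -> List[Sample]:
        k = min(len(samples), max(1, int(len(samples) * keep_fraction)))
        return rng.sample(samples, k)
    return select


def task_grounded_filter(succeeds: Callable[[Sample], bool]) -> Selector:
    """Keep only samples that pass a task check (hypothetical predicate)."""
    def select(samples: List[Sample], rng: random.Random) -> List[Sample]:
        return [s for s in samples if succeeds(s)]
    return select


def iterate_self_training(seed_data, train, generate, select,
                          generations=10, rng_seed=0):
    """Return the training data used at each generation, seed data first."""
    rng = random.Random(rng_seed)
    data = list(seed_data)
    history = [data]
    for _ in range(generations):
        if not data:  # the filter rejected everything; nothing to train on
            break
        model = train(data)                             # fit on previous output
        outputs = generate(model, len(seed_data), rng)  # sample a new corpus
        data = select(outputs, rng)                     # optional filtering step
        history.append(data)
    return history


# Toy stand-ins (assumptions): the "model" is just its training set, and
# generation resamples from it with replacement.
def toy_train(data: List[Sample]) -> List[Sample]:
    return list(data)


def toy_generate(model: List[Sample], n: int, rng: random.Random) -> List[Sample]:
    return [rng.choice(model) for _ in range(n)]


seed_corpus = [f"utterance-{i}" for i in range(20)]
selectors = [
    ("unfiltered", no_filter),
    ("random", random_filter(0.5)),
    # Placeholder task check: reject samples ending in "7".
    ("task-grounded", task_grounded_filter(lambda s: not s.endswith("7"))),
]
for name, selector in selectors:
    history = iterate_self_training(seed_corpus, toy_train, toy_generate, selector)
    # Number of distinct samples per generation, as a crude diversity proxy.
    print(name, [len(set(g)) for g in history])
```

In a real replication, `train` and `generate` would wrap fine-tuning and sampling from the LLM, and the per-generation diversity count would be replaced by whatever compositionality measure the study uses.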
