Power-law data distribution boosts AI compositional reasoning
A recent study posted to arXiv reports that, for compositional reasoning tasks such as state tracking and multi-step arithmetic, training AI models on data whose skill frequencies follow a power-law distribution (as natural language does, with a few skills appearing very often and most appearing only rarely) outperforms training on a uniformly distributed mix. Using a minimalist skill-composition task, the authors show that power-law sampling reaches comparable performance with significantly less training data. Their theoretical analysis attributes this to a beneficial asymmetry in the loss landscape: models first learn the high-frequency skill compositions at low sample complexity, and that early progress in turn makes the rarer compositions easier to learn. The paper, titled "The Power of Power Law: Asymmetry Enables Compositional Reasoning," was submitted to arXiv on April 26, 2025.
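To make the contrast concrete, here is a minimal sketch of power-law versus uniform sampling over a fixed pool of skill compositions, assuming a Zipf-style sampler. The pool size, exponent, and batch size (`n_compositions`, `alpha`, and the 50,000 draws) are illustrative choices, not values from the paper.

```python
# Sketch (not the paper's code): contrast power-law and uniform sampling
# over a fixed pool of skill compositions.
import numpy as np

rng = np.random.default_rng(0)

n_compositions = 1000  # size of the skill-composition pool (illustrative)
alpha = 1.0            # power-law exponent; alpha = 1 gives Zipf's law

# Power-law weights: the composition at rank k is sampled with probability
# proportional to 1/k^alpha, so a few head items dominate and the long tail
# of compositions is rare.
ranks = np.arange(1, n_compositions + 1, dtype=float)
powerlaw_p = ranks ** (-alpha)
powerlaw_p /= powerlaw_p.sum()

uniform_p = np.full(n_compositions, 1.0 / n_compositions)

# Draw a training set of composition indices under each distribution.
powerlaw_batch = rng.choice(n_compositions, size=50_000, p=powerlaw_p)
uniform_batch = rng.choice(n_compositions, size=50_000, p=uniform_p)

# Under the power law, the 10 most frequent compositions account for a large
# share of training examples; this skew is the asymmetry the paper credits
# with bootstrapping compositional learning.
print("top-10 share, power-law:", np.mean(powerlaw_batch < 10))
print("top-10 share, uniform: ", np.mean(uniform_batch < 10))
```

With these settings the power-law head claims roughly 40% of all draws versus 1% under the uniform distribution, which is why the rare tail can ride on skills already learned from the head.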
Key facts
- arXiv paper ID: 2604.22951
- Submitted to arXiv: April 26, 2025
- Power-law distribution outperforms uniform distribution for compositional reasoning
- Tasks tested: state tracking, multi-step arithmetic (see the toy generator after this list)
- Power-law sampling requires significantly less training data
- Beneficial asymmetry in loss landscape is key mechanism
- Minimalist skill-composition task used to demonstrate the data-efficiency gain
- Natural language data follows power-law distribution
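The multi-step arithmetic task referenced above can be pictured as chaining a few atomic operations and asking for the final value, which also doubles as a simple state-tracking exercise. The toy generator below is a hypothetical illustration; the operation set, prompt format, and `make_example` helper are assumptions, not the paper's benchmark.

```python
# Hypothetical toy generator for a multi-step arithmetic task: each example
# composes `depth` atomic skills into one problem. Not the paper's benchmark.
import random

ATOMIC_OPS = {
    "add3": lambda x: x + 3,
    "double": lambda x: x * 2,
    "sub1": lambda x: x - 1,
}

def make_example(depth: int, rng: random.Random) -> tuple[str, int]:
    """Compose `depth` atomic skills and return (prompt, answer)."""
    x = rng.randint(0, 9)
    names = [rng.choice(list(ATOMIC_OPS)) for _ in range(depth)]
    # Thread the running value through each operation in order; answering
    # correctly requires tracking this intermediate state step by step.
    y = x
    for name in names:
        y = ATOMIC_OPS[name](y)
    prompt = f"start={x}; apply {' then '.join(names)}; result=?"
    return prompt, y

rng = random.Random(0)
print(make_example(3, rng))
```

In the paper's setup, which compositions such a generator emits (and how often) would itself be governed by the power-law or uniform distribution being compared.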
Entities
Institutions
- arXiv