Survey on Mathematical Reasoning in Large Language Models

publication · 2026-05-20

A recent survey on arXiv examines mathematical reasoning in Large Language Models (LLMs). It reviews approximately 120 peer-reviewed articles and preprints, covering datasets, architectures, training strategies, and evaluation protocols. It proposes a unified taxonomy of mathematical datasets that distinguishes pretraining corpora, supervised fine-tuning resources, and evaluation benchmarks. The resulting framework offers a systematic way to assess current progress and limitations in LLM mathematical reasoning, and the survey emphasizes the topic's significance for education, scientific research, and industry applications.

Key facts

arXiv:2605.19723v1
Survey covers approximately 120 peer-reviewed studies and preprints
Introduces unified taxonomy of mathematical datasets
Distinguishes pretraining corpora, supervised fine-tuning resources, and evaluation benchmarks
Focuses on LLMs and their mathematical reasoning capabilities
Treats mathematical reasoning as a benchmark capability for AI systems
Structured analysis of datasets, architectures, training strategies, and evaluation protocols
Provides a unified analytical framework
