M2A Merges Mathematical and Agentic Reasoning in LLMs
A new paradigm called M2A synergizes mathematical and agentic reasoning in large language models through model merging. Mathematical reasoning relies on intrinsic logic to solve closed-world problems in a single response, while agentic reasoning requires multi-turn interaction with external environments. This misalignment prevents the two from benefiting each other and makes behavior unstable under multi-task learning. Instead of joint training, M2A operates directly in parameter space, identifying the feature subspace critical for agent behavior so that merging combines both reasoning types without overfitting to superficial reasoning patterns.
Key facts
- M2A is a novel paradigm for synergizing mathematical and agentic reasoning.
- Mathematical reasoning uses intrinsic logic for closed-world problems in a single response.
- Agentic reasoning requires multi-turn interaction with external environments.
- The misalignment between the two reasoning types prevents effective mutual benefit.
- Multi-task learning yields unstable reasoning behavior and limited performance gains.
- M2A operates directly in parameter space.
- It identifies the feature subspace critical for agent behavior.
- Model merging avoids overfitting to superficial reasoning patterns.
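The paper's exact merging procedure isn't spelled out here, but the idea of merging in parameter space while protecting an agent-critical subspace can be sketched with task-vector arithmetic. The sketch below is illustrative only: the function name, the coefficients, and the magnitude-based mask (a crude stand-in for "identifying the feature subspace critical for agent behavior") are all assumptions, not M2A's actual algorithm.

```python
import numpy as np

def merge_in_subspace(base, math_model, agent_model,
                      alpha=0.5, beta=0.5, top_frac=0.3):
    """Illustrative parameter-space merge of two fine-tuned models.

    Each model is a dict mapping parameter names to numpy arrays.
    The math task vector is applied everywhere; the agent task vector
    is applied only on its largest-magnitude coordinates, a toy proxy
    for an 'agent-critical' feature subspace.
    """
    merged = {}
    for name, w_base in base.items():
        tv_math = math_model[name] - w_base    # math task vector
        tv_agent = agent_model[name] - w_base  # agent task vector
        # Keep only the top fraction of agent-vector coordinates by magnitude.
        k = max(1, int(top_frac * tv_agent.size))
        thresh = np.sort(np.abs(tv_agent).ravel())[-k]
        mask = (np.abs(tv_agent) >= thresh).astype(tv_agent.dtype)
        merged[name] = w_base + alpha * tv_math + beta * mask * tv_agent
    return merged
```

The key point the sketch conveys is that merging edits weights, not training data: the base model never sees both tasks jointly, which is how this family of methods sidesteps the unstable multi-task behavior described above.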
Entities
Institutions
- arXiv