SAGE-32B AI Model Advances Agentic Reasoning Through Iterative Distillation
SAGE-32B is a 32-billion-parameter language model engineered specifically for agentic reasoning and long-range planning tasks. Unlike conversational AI systems, it operates within an agentic loop that prioritizes task decomposition, tool utilization, and error correction. The model was initialized from the pretrained Qwen2.5-32B and fine-tuned with a two-stage process called Iterative Distillation, which strengthens reasoning through systematic feedback loops. A distinctive feature is its inverse reasoning approach: a meta-cognition component anticipates potential failures before execution. On benchmarks such as MMLU-Pro, AgentBench, and MATH-500, SAGE-32B achieves higher success rates in multi-tool scenarios than baseline models of similar scale, while remaining competitive on standard reasoning assessments. The architecture emphasizes practical application in complex, sequential decision-making environments.
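How such a loop might look in practice is easiest to see in code. The sketch below is a minimal illustration of an agentic loop with a pre-execution meta-cognition check, in the spirit of what the summary describes; the Step class, the function names (decompose_task, predict_failures, revise_step), and the control flow are assumptions made for illustration, not SAGE-32B's actual implementation.

```python
# Illustrative sketch of an agentic loop with a pre-execution
# meta-cognition check. All names and control flow are hypothetical.
from dataclasses import dataclass


@dataclass
class Step:
    tool: str   # which tool the step would invoke
    args: dict  # arguments for that tool


def run_agent(task: str, model, tools: dict, max_retries: int = 2) -> list:
    """Decompose a task, vet each step before running it, retry on errors."""
    results = []
    # 1. Task decomposition: the model breaks the task into tool-using steps.
    plan = model.decompose_task(task)  # hypothetical API
    for step in plan:
        for attempt in range(max_retries + 1):
            # 2. Inverse reasoning: predict likely failure modes *before*
            #    the step is executed, and repair the plan if any are found.
            risks = model.predict_failures(step)  # hypothetical API
            if risks:
                step = model.revise_step(step, risks)  # hypothetical API
            # 3. Tool utilization.
            try:
                results.append(tools[step.tool](**step.args))
                break
            except Exception as err:
                # 4. Error correction: feed the failure back into the model
                #    and retry with the revised step.
                step = model.revise_step(step, [str(err)])
    return results
```

The design point worth noting is that failure prediction happens before tool invocation, so error handling is partly anticipatory rather than purely reactive.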
Key facts
- SAGE-32B is a 32 billion parameter language model
- It focuses on agentic reasoning and long-range planning tasks
- The model is initialized from the pretrained Qwen2.5-32B model
- It is fine-tuned with a two-stage Iterative Distillation process (see the sketch after this list)
- It introduces an inverse reasoning approach with meta-cognition
- It performs well on MMLU-Pro, AgentBench, and MATH-500 benchmarks
- It achieves higher success rates in multi-tool usage scenarios
- It remains competitive on standard reasoning evaluations
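The summary does not spell out the two stages of Iterative Distillation; a plausible reading is that a teacher generates reasoning traces, the student is fine-tuned on the traces that verify as correct, and the improved student drives the next round. The sketch below is a hedged interpretation along those lines; generate_trace, verify, and finetune are hypothetical names, not SAGE-32B's published recipe.

```python
# Hypothetical sketch of a two-stage iterative distillation loop.
# Stage 1 generates and filters reasoning traces; stage 2 fine-tunes
# the student on them; the loop repeats with the improved student.
# All APIs here are assumptions for illustration only.

def iterative_distillation(student, teacher, tasks, rounds: int = 3):
    for _ in range(rounds):
        # Stage 1: collect traces from the current teacher and keep only
        # those whose final answers check out; this verification is the
        # "systematic feedback" signal.
        traces = []
        for task in tasks:
            trace = teacher.generate_trace(task)  # hypothetical API
            if task.verify(trace.answer):         # hypothetical API
                traces.append(trace)

        # Stage 2: fine-tune the student on the verified traces.
        student = student.finetune(traces)        # hypothetical API

        # Feedback loop: the improved student becomes the next teacher,
        # so each round distills from a stronger source of traces.
        teacher = student
    return student
```

Because the improved student is promoted to teacher each round, the quality of the distillation targets can ratchet upward across iterations.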