ARTFEED — Contemporary Art Intelligence

Metacognition-as-Reward Framework Enhances LLM Reasoning

ai-technology · 2026-05-25

A new reinforcement learning framework, Metacognition-as-Reward (MaR), improves reasoning in large language models by incorporating metacognitive knowledge and regulation signals into the reward. MaR addresses limitations of two existing reward paradigms. Reinforcement learning with verifiable rewards (RLVR) relies on outcome signals from executable checks or ground-truth answers, so it offers limited intermediate guidance. Rubrics as Rewards (RaR) uses natural-language rubrics but requires instance-specific design. MaR introduces two general process dimensions to provide reward guidance: metacognitive knowledge, which identifies task-relevant information without hand-crafted rubrics, and metacognitive regulation, which plans and adjusts the reasoning. The framework is detailed in arXiv paper 2605.23384.
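To make the idea of process-level reward guidance concrete, here is a minimal Python sketch of how an outcome signal could be blended with two process scores, one per metacognitive dimension. This is an illustration under stated assumptions, not the paper's formulation: the names `ProcessScores` and `combined_reward`, the [0, 1] score range, the linear weighting, and the default weights are all hypothetical, and the summary above does not say how MaR actually computes or combines its signals.

```python
from dataclasses import dataclass


@dataclass
class ProcessScores:
    """Hypothetical judge scores in [0, 1] for one reasoning trace."""
    knowledge: float   # did the trace identify the task-relevant information?
    regulation: float  # did the trace plan and adjust its reasoning?


def combined_reward(outcome_correct: bool, scores: ProcessScores,
                    w_knowledge: float = 0.25,
                    w_regulation: float = 0.25) -> float:
    """Blend an outcome signal with two process signals.

    The linear form and the weights are assumptions made for illustration.
    """
    for name, value in (("knowledge", scores.knowledge),
                        ("regulation", scores.regulation)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} score must be in [0, 1], got {value}")
    outcome = 1.0 if outcome_correct else 0.0
    # Whatever weight is not given to the process scores goes to the outcome.
    w_outcome = 1.0 - w_knowledge - w_regulation
    return (w_outcome * outcome
            + w_knowledge * scores.knowledge
            + w_regulation * scores.regulation)


if __name__ == "__main__":
    # A wrong final answer still earns partial credit for a sound process.
    print(combined_reward(False, ProcessScores(knowledge=1.0, regulation=0.5)))
```

In this sketch, an outcome-only setup corresponds to setting both process weights to zero, which shows the contrast the article draws: without the process terms, a trace with a wrong final answer receives no intermediate guidance at all.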

Key facts

  • MaR stands for Metacognition-as-Reward
  • Framework is metacognition-inspired
  • Addresses RLVR and RaR limitations
  • Two dimensions: metacognitive knowledge and metacognitive regulation
  • Metacognitive knowledge identifies task-relevant information without instance-specific rubrics
  • Metacognitive regulation plans and adjusts the reasoning process
  • Paper available on arXiv with ID 2605.23384
  • arXiv announce type is "cross" (a cross-listing)

Entities

Institutions

  • arXiv

Sources