Schema-aware Cumulative Process Reward Model for KG Question Answering

ai-technology · 2026-05-06

A new model, SCPRM (Schema-aware Cumulative Process Reward Model), has been proposed to improve reasoning in knowledge graph (KG) question answering. When large language models are used to evaluate intermediate reasoning steps, they are prone to a risk compensation effect: an incorrect step can be offset by later correct ones, so a flawed path still receives a high reward. The problem is most serious in risk-sensitive domains such as medical and legal KG reasoning, where multiple paths exist between the same entities. SCPRM addresses it by conditioning each step's evaluation on the reasoning prefix and by incorporating the schema distance between the current step and the implicit target parsed from the query, which lets it provide both cumulative and future rewards. The goal is more reliable step-wise supervision in complex reasoning tasks.
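
The risk compensation effect can be seen in a toy calculation. The sketch below is illustrative only and is not SCPRM's scoring rule, which the source does not spell out. It assumes hypothetical per-step scores in [0, 1], uses averaging as a stand-in for step-independent scoring, and uses a running product as one simple way to make each reward depend on the whole prefix.

```python
def mean_reward(step_scores):
    """Score a path by averaging step scores (hypothetical step-independent baseline)."""
    return sum(step_scores) / len(step_scores)


def cumulative_rewards(step_scores):
    """Prefix-conditioned stand-in: the reward at step t is the product of scores up to t.

    This is an assumed aggregation chosen for illustration, not the paper's formula.
    """
    rewards = []
    running = 1.0
    for score in step_scores:
        running *= score
        rewards.append(running)
    return rewards


# Hypothetical step scores: the flawed path has one clearly wrong hop (0.1)
# surrounded by confident correct hops; the sound path is uniformly decent.
flawed_path = [1.0, 0.1, 1.0, 1.0]
sound_path = [0.75, 0.75, 0.75, 0.75]

# Averaging lets the later correct steps compensate for the bad hop.
mean_prefers_flawed = mean_reward(flawed_path) > mean_reward(sound_path)

# With a cumulative reward, the bad hop keeps penalising every later step.
flawed_final = cumulative_rewards(flawed_path)[-1]
sound_final = cumulative_rewards(sound_path)[-1]
cumulative_prefers_sound = sound_final > flawed_final

print(mean_prefers_flawed, cumulative_prefers_sound)
```

Under these assumed scores, averaging ranks the flawed path above the sound one, while the cumulative variant reverses the ranking because the error at the second hop carries forward through the rest of the path.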

Key facts

SCPRM stands for Schema-aware Cumulative Process Reward Model.
It is designed for knowledge graph question answering.
Large language models that evaluate intermediate reasoning steps are prone to a risk compensation effect.
Incorrect steps can be offset by later correct steps, so flawed paths receive high rewards.
The existence of multiple paths between entities in a KG exacerbates the issue.
Risk-sensitive tasks include medical and legal KG reasoning.
SCPRM conditions on the reasoning prefix.
It incorporates the schema distance between the current step and the implicit target parsed from the query.
The model provides cumulative and future rewards.
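
One way to picture schema distance is as the number of hops between entity types in the KG schema. The sketch below rests on that assumption; the source does not define the measure. The toy medical schema, the function names, and the 1 / (1 + distance) mapping to a future reward are all hypothetical and are not taken from SCPRM.

```python
from collections import deque

# Hypothetical schema: each pair means some relation links the two entity types.
SCHEMA_EDGES = [
    ("Drug", "Disease"),
    ("Disease", "Symptom"),
    ("Disease", "Gene"),
    ("Gene", "Protein"),
]


def build_adjacency(edges):
    """Build an undirected adjacency map over entity types."""
    adjacency = {}
    for left, right in edges:
        adjacency.setdefault(left, set()).add(right)
        adjacency.setdefault(right, set()).add(left)
    return adjacency


def schema_distance(adjacency, current_type, target_type):
    """Breadth-first search for the fewest schema hops; None if unreachable."""
    if current_type == target_type:
        return 0
    seen = {current_type}
    queue = deque([(current_type, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbour in adjacency.get(node, ()):
            if neighbour == target_type:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


def future_reward(distance):
    """Assumed mapping: steps that land closer to the target type score higher."""
    return 0.0 if distance is None else 1.0 / (1.0 + distance)


adjacency = build_adjacency(SCHEMA_EDGES)
# Suppose the query implicitly asks for a Symptom and the current step ends on a Drug.
hops = schema_distance(adjacency, "Drug", "Symptom")
print(hops, future_reward(hops))
```

In this toy setup, a step that ends on a type nearer to the target type receives a larger future reward, which is the intuition behind combining schema distance with prefix-conditioned scoring.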

Sources

arXiv cs.AI — 2026-05-05