ARTFEED — Contemporary Art Intelligence

Stepwise Confidence Attribution for Diagnosing LLM Reasoning Failures

ai-technology · 2026-05-20

A new framework called Stepwise Confidence Attribution (SCA) diagnoses where multi-step reasoning fails in black-box large language models (LLMs) by assigning a confidence score to each step based solely on the generated reasoning traces. SCA applies the Information Bottleneck (IB) principle: steps that align with the consensus across correct solutions receive high confidence, while deviations are flagged as potentially erroneous. Two methods are proposed: NIBS, a non-parametric IB method that does not use graph structures, and GIBS, a graph-based IB method that learns subgraphs via a differentiable mask. Because it needs no access to model internals, the approach works for closed-source LLMs, addressing a limitation of existing methods, which either estimate confidence only for final answers or require model internals.
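To make the consensus idea concrete, here is a minimal Python sketch of step-level scoring from traces alone. It is not NIBS or GIBS: this summary does not describe their internals, so token-level Jaccard overlap is swapped in as a simple stand-in for agreement with correct solutions. The function names (`step_confidences`, `flag_deviations`), the threshold, and the toy traces are all hypothetical.

```python
from statistics import mean


def tokenize(step: str) -> set[str]:
    """Lowercase a step and split it into a set of whitespace-delimited tokens."""
    return set(step.lower().split())


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard overlap between two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def step_confidences(trace: list[str], correct_traces: list[list[str]]) -> list[float]:
    """Score each step by its best match in every correct trace, averaged over traces.

    Assumption: lexical overlap stands in for the consensus signal; the actual
    SCA methods are IB-based and are not reproduced here.
    """
    scores = []
    for step in trace:
        tokens = tokenize(step)
        best_per_reference = [
            max((jaccard(tokens, tokenize(ref_step)) for ref_step in ref), default=0.0)
            for ref in correct_traces
        ]
        scores.append(mean(best_per_reference) if best_per_reference else 0.0)
    return scores


def flag_deviations(trace: list[str], correct_traces: list[list[str]],
                    threshold: float = 0.6) -> list[int]:
    """Return indices of steps whose confidence falls below the (hypothetical) threshold."""
    confidences = step_confidences(trace, correct_traces)
    return [i for i, score in enumerate(confidences) if score < threshold]


# Toy data, invented for illustration only.
correct_traces = [
    ["x = 3 + 4", "x = 7", "answer is 7"],
    ["3 + 4 = 7", "x = 7", "answer is 7"],
]
trace = ["x = 3 + 4", "x = 8", "answer is 8"]

flagged = flag_deviations(trace, correct_traces)  # steps 1 and 2 fall below the cutoff
```

The sketch only needs text output from the model, which mirrors the black-box setting described above. A realistic variant would likely replace the lexical overlap with a semantic similarity measure, since two correct steps can share few tokens.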

Key facts

  • SCA diagnoses multi-step reasoning failures in black-box LLMs
  • Assigns step-level confidence based only on generated reasoning traces
  • Applies the Information Bottleneck (IB) principle
  • Steps aligning with consensus across correct solutions receive high confidence
  • Deviations are flagged as potentially erroneous
  • Two methods: NIBS (non-parametric) and GIBS (graph-based)
  • Works for closed-source LLMs without internal model access
  • Existing methods are restricted to final answers or require internal access
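
For reference, the Information Bottleneck principle named above is usually written as the following trade-off. This is the generic textbook objective, not necessarily the exact form SCA optimizes. The mapping of symbols to this setting is an assumption: X could be the full reasoning trace, Z the compressed representation (for example, the retained steps), and Y the signal to be preserved (for example, agreement with correct solutions).

```latex
\min_{p(z \mid x)} \; I(X; Z) \;-\; \beta \, I(Z; Y)
```

Here I(·;·) denotes mutual information and β > 0 balances compressing X against keeping the information in Z that is relevant to Y.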

Entities

Institutions

  • arXiv

Sources