ARTFEED — Contemporary Art Intelligence

DDC Framework Balances Budget and Quality in LLM Inference Scaling

ai-technology · 2026-05-16

A new research paper introduces Dual-Dimensional Consistency (DDC), a unified framework for adaptive inference-time scaling in Large Language Models (LLMs). Current methods treat sampling width (the number of parallel reasoning paths) and depth (how far each path is extended) as separate objectives, which leads to inefficiencies: width consensus can reinforce hallucinations when many weak paths agree on the same wrong answer, while depth pruning may cut off valid reasoning chains too early. DDC couples a Confidence-Weighted Bayesian protocol with Trend-Aware Stratified Pruning to concentrate compute on high-quality paths, filtering out hallucinated paths and accelerating consensus. Evaluations across five benchmarks show reduced token consumption with reasoning quality maintained. The paper is available on arXiv under ID 2605.15100.
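To make the width side concrete, here is a minimal sketch of how a confidence-weighted vote can beat a plain majority vote. The function name, the confidence scores, and the aggregation rule are illustrative assumptions, not the paper's actual protocol, which this summary does not specify.

    # Sketch: confidence-weighted answer consensus across sampled paths.
    # An assumption of how a "Confidence-Weighted Bayesian protocol" might
    # aggregate votes; the paper's exact formulation is not given here.
    from collections import defaultdict

    def weighted_consensus(samples):
        """samples: list of (answer, confidence) pairs from parallel paths.

        A plain majority vote lets several low-confidence paths that agree
        on a hallucinated answer outvote one high-confidence correct path.
        Weighting each vote by its confidence damps that failure mode.
        """
        scores = defaultdict(float)
        for answer, confidence in samples:
            scores[answer] += confidence  # hypothetical score in [0, 1]
        return max(scores, key=scores.get)

    # Three low-confidence paths agree on "17", but the single
    # high-confidence path still wins the weighted vote.
    paths = [("42", 0.9), ("17", 0.2), ("17", 0.25), ("17", 0.2)]
    print(weighted_consensus(paths))  # -> "42"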

Key facts

  • DDC is a unified framework for adaptive inference-time scaling.
  • Current methods treat sampling width and depth as orthogonal objectives.
  • Width consensus risks reinforcing hallucinations.
  • Depth pruning can prematurely truncate complex but valid reasoning chains.
  • DDC uses a Confidence-Weighted Bayesian protocol and Trend-Aware Stratified Pruning (see the pruning sketch after this list).
  • Evaluated across five benchmarks.
  • The approach reduces token consumption while maintaining reasoning quality.
  • Paper available on arXiv: 2605.15100.
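The pruning sketch referenced above illustrates the depth side under similar caveats: the window size, slope cutoff, and confidence floor are hypothetical parameters, and the paper's stratification scheme is not described in this summary. The idea is to prune a partial chain only when its confidence is both low and trending downward, so a chain that dips and then recovers is not truncated.

    # Sketch: trend-aware pruning of a partial reasoning chain. Names and
    # thresholds are illustrative assumptions, not the paper's spec.
    def should_prune(step_confidences, window=4, slope_cutoff=-0.05, floor=0.3):
        """step_confidences: per-step confidence history of one chain."""
        if len(step_confidences) < window:
            return False  # not enough history to judge a trend
        recent = step_confidences[-window:]
        # Simple slope estimate: average step-to-step change over the window.
        slope = (recent[-1] - recent[0]) / (window - 1)
        # Prune only when confidence is both low and trending downward,
        # unlike a fixed threshold that cuts any momentary dip.
        return recent[-1] < floor and slope < slope_cutoff

    # A chain that dips but recovers is kept; a steady decline is cut.
    print(should_prune([0.8, 0.5, 0.6, 0.7]))   # -> False (recovering)
    print(should_prune([0.6, 0.45, 0.3, 0.2]))  # -> True  (declining, low)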

Entities

Institutions

  • arXiv
