ARTFEED — Contemporary Art Intelligence

Metacognitive Probe Diagnoses LLM Confidence Calibration Across Five Dimensions

ai-technology · 2026-05-12

Researchers have developed the Metacognitive Probe, a five-task diagnostic that decomposes the confidence behavior of large language models into five dimensions: confidence calibration, epistemic vigilance, knowledge boundaries, calibration range, and reasoning-chain validation. Inspired by Flavell (1979) and Nelson and Narens (1990), the probe was evaluated on eight frontier models and 69 human subjects, and it measures observable confidence-correctness alignment rather than any claimed internal state. The authors caution that the instrument is not a validated cross-species metacognition scale, and a pre-specified hypothesis about human development was falsified in testing. The motivation: composite benchmarks such as MMLU, BIG-Bench, HELM, and GPQA measure whether a model answers correctly, but not whether it recognizes when its answers are wrong, so a model can post a high aggregate score while remaining overconfident in specific areas.

Key facts

  • The Metacognitive Probe is a five-task, 15-slot diagnostic.
  • It decomposes LLM confidence into five dimensions: T1-CC, T2-EV, T3-KB, T4-CR, T5-RCV.
  • Evaluated on N=8 frontier models and N=69 humans.
  • Motivated by Flavell (1979) and Nelson and Narens (1990).
  • The instrument is not a validated cross-species metacognition scale.
  • A pre-specified human developmental hypothesis was falsified.
  • Composite benchmarks (MMLU, BIG-Bench, HELM, GPQA) say nothing about a model's awareness of its own errors.
  • A model can score 80 on a composite calibration benchmark yet be overconfident in narrow pockets.
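The last point can be made concrete with a small sketch. Expected calibration error (ECE) is a standard calibration metric, not one the article names, and the topic "pockets" and accuracy figures below are invented for illustration: two pockets with opposite miscalibration can average out to a near-perfect aggregate score.

```python
def expected_calibration_error(preds, n_bins=10):
    """ECE: bin predictions by confidence, then take the size-weighted
    average of |mean confidence - accuracy| per bin.
    preds is a list of (confidence, correct) pairs."""
    bins = [[] for _ in range(n_bins)]
    for conf, correct in preds:
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, correct))
    ece = 0.0
    for b in bins:
        if not b:
            continue
        avg_conf = sum(c for c, _ in b) / len(b)
        accuracy = sum(ok for _, ok in b) / len(b)
        ece += (len(b) / len(preds)) * abs(avg_conf - accuracy)
    return ece

# Hypothetical pockets: the model reports 0.8 confidence everywhere,
# but is 95% accurate in pocket A and only 65% accurate in pocket B.
pocket_a = [(0.8, i % 20 != 0) for i in range(100)]  # underconfident
pocket_b = [(0.8, i % 20 < 13) for i in range(100)]  # overconfident
combined = pocket_a + pocket_b

print(f"pocket A ECE: {expected_calibration_error(pocket_a):.3f}")  # 0.150
print(f"pocket B ECE: {expected_calibration_error(pocket_b):.3f}")  # 0.150
print(f"combined ECE: {expected_calibration_error(combined):.3f}")  # 0.000
```

The combined set is 80% accurate at 0.8 confidence, so the aggregate metric reports essentially perfect calibration while each pocket is off by 0.15, which is exactly the blind spot the per-dimension probe is meant to expose.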
