MiRD Framework for Reliable Set-Valued Prediction in Open-Ended QA

ai-technology · 2026-05-27

MiRD is a two-stage framework for reliable set-valued prediction. It targets hallucinations in open-ended question answering by decomposing overall miscoverage into two components: sampling failure and conditional selection failure. Stage I gives an expectation-level marginal upper bound on the probability that finite sampling yields no acceptable answer within a predetermined budget. Stage II calibrates a conformal selection threshold using admission-correlated nonconformity scores computed on the full calibration set, which preserves calibration-set integrity. The framework was evaluated on three open-ended QA datasets and eight models, where it controlled sampling risk.
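The decomposition can be read as a union-style bound. The notation below is assumed for illustration and is not taken from the paper: S is the set of answers sampled within the budget, \hat{C} \subseteq S is the selected prediction set, A is the set of acceptable answers, and \alpha_1 and \alpha_2 are hypothetical risk budgets for the two stages.

```latex
\begin{align*}
\Pr[\hat{C} \cap A = \emptyset]
  &= \Pr[S \cap A = \emptyset]
     + \Pr[\hat{C} \cap A = \emptyset,\; S \cap A \neq \emptyset] \\
  &\le \Pr[S \cap A = \emptyset]
     + \Pr[\hat{C} \cap A = \emptyset \mid S \cap A \neq \emptyset] \\
  &\le \alpha_1 + \alpha_2
\end{align*}
```

The first line holds because the prediction set is drawn from the samples, so a sampling failure always implies miscoverage. The first term is what Stage I bounds and the second is what Stage II controls. How the paper actually splits the total budget between the stages is not stated in this summary.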

Key facts

MiRD decomposes miscoverage into sampling failure and conditional selection failure.
Stage I provides an expectation-level marginal upper bound on sampling failure probability.
Stage II calibrates a conformal selection threshold using admission-correlated nonconformity scores.
The framework preserves calibration-set integrity by using the full calibration set.
Tested on three open-ended QA datasets and eight models.
MiRD controls sampling risk in set-valued prediction.
The approach mitigates hallucinations in open-ended QA.
The paper is published on arXiv with ID 2605.27091.
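The two quantities named in the key facts can be sketched in Python. This is a minimal split-conformal sketch under stated assumptions, not the paper's implementation: every function name is hypothetical, the nonconformity scores are taken as given, and assigning an infinite score to calibration questions with no acceptable sample is one assumed way of keeping the full calibration set.

```python
import math


def sampling_failure_rate(acceptable_flags):
    """Fraction of questions whose sampled answers contain no acceptable one.

    acceptable_flags: one list of booleans per question, one flag per sample.
    This is a plain empirical rate, not the Stage I bound itself.
    """
    failures = sum(1 for flags in acceptable_flags if not any(flags))
    return failures / len(acceptable_flags)


def min_acceptable_score(scores, acceptable):
    """Smallest nonconformity score among acceptable samples for one question.

    Returns inf when no sample is acceptable (assumption: the question stays
    in the calibration set instead of being dropped).
    """
    good = [s for s, ok in zip(scores, acceptable) if ok]
    return min(good) if good else math.inf


def conformal_threshold(cal_scores, alpha):
    """Standard split-conformal quantile of the calibration scores.

    Picks the ceil((n + 1) * (1 - alpha))-th smallest score, or inf when the
    calibration set is too small for the requested alpha.
    """
    n = len(cal_scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return math.inf
    return sorted(cal_scores)[rank - 1]


def prediction_set(candidates, scores, threshold):
    """Keep every sampled answer whose score does not exceed the threshold."""
    return [c for c, s in zip(candidates, scores) if s <= threshold]
```

In use, each calibration question contributes one value from `min_acceptable_score`, those values go to `conformal_threshold`, and the resulting threshold filters the samples of a new question through `prediction_set`.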
