ARTFEED — Contemporary Art Intelligence

DeFacto: Counterfactual Reasoning Framework for Multimodal AI

ai-technology · 2026-05-22

Researchers have introduced DeFacto, a counterfactual reasoning framework that aims to keep the visual evidence a multimodal large language model (MLLM) relies on consistent with the answer it gives. The framework combines three training paradigms: positive, counterfactual, and random-masking. An automated, language-guided evidence construction pipeline identifies the regions relevant to each question and creates counterfactual variants, which yields the DeFacto-100K dataset. MLLMs are then trained with GRPO-based reinforcement learning, using three complementary rewards that encourage accurate answers, structured reasoning, and reliable evidence selection. The work targets a known weakness of existing MLLMs: a correct answer may rest on flawed visual evidence.
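To make the training signal concrete, here is a minimal Python sketch of how three rewards could feed a GRPO-style update. GRPO scores each sampled response relative to the other responses drawn for the same prompt. The summary does not say how DeFacto defines or weights its rewards, so the boolean checks, the equal weights, and the function names below are illustrative assumptions, not the authors' implementation.

```python
from statistics import mean, pstdev
from typing import List, Sequence, Tuple


def combined_reward(
    answer_ok: bool,
    format_ok: bool,
    evidence_ok: bool,
    weights: Tuple[float, float, float] = (1.0, 1.0, 1.0),
) -> float:
    """Sum three reward terms (hypothetical form; equal weights are an assumption).

    answer_ok   -- the final answer matches the reference
    format_ok   -- the response follows the expected reasoning structure
    evidence_ok -- the selected evidence regions are judged reliable
    """
    w_ans, w_fmt, w_evi = weights
    return w_ans * answer_ok + w_fmt * format_ok + w_evi * evidence_ok


def group_relative_advantages(rewards: Sequence[float], eps: float = 1e-6) -> List[float]:
    """Normalize rewards within one group of responses sampled for the same prompt.

    Each advantage is (reward - group mean) / (group std + eps).
    Population std is used here; implementations differ on this detail.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled responses to one question, scored on the three checks.
group = [
    combined_reward(True, True, True),
    combined_reward(True, False, False),
    combined_reward(False, True, False),
    combined_reward(True, True, True),
]
advantages = group_relative_advantages(group)
```

Responses that score above the group average receive a positive advantage and are reinforced; those below it are discouraged. If every response in a group earns the same reward, all advantages are zero and that group contributes no gradient.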

Key facts

  • DeFacto is a counterfactual reasoning framework for multimodal AI.
  • It aims to enforce evidence-answer consistency in MLLMs.
  • Three training paradigms: positive, counterfactual, random-masking.
  • Language-guided pipeline creates DeFacto-100K dataset.
  • GRPO-based reinforcement learning with three rewards.
  • Listed on arXiv (2509.20912) as a replacement, meaning a revised version of an earlier submission.
  • Addresses failure of existing methods to ensure evidence-answer alignment.
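The three training paradigms can be illustrated with a small masking sketch. The summary names the paradigms but does not describe how each sample is built, so this example assumes one plausible reading: the positive sample keeps the image intact, the counterfactual sample masks the evidence region and expects the model to abstain, and the random-masking sample masks an unrelated region and keeps the original answer. The "unknown" label, the box format, and all function names are hypothetical.

```python
import random
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (top, left, bottom, right), end-exclusive
Image = List[List[int]]          # toy grayscale image as nested lists


def mask_region(image: Image, box: Box, fill: int = 0) -> Image:
    """Return a copy of the image with the given box overwritten by `fill`."""
    top, left, bottom, right = box
    out = [row[:] for row in image]
    for r in range(top, bottom):
        for c in range(left, right):
            out[r][c] = fill
    return out


def boxes_overlap(a: Box, b: Box) -> bool:
    """True if two end-exclusive boxes share at least one pixel."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def random_box_outside(height: int, width: int, evidence: Box, size: int,
                       rng: random.Random, max_tries: int = 1000) -> Box:
    """Sample a size x size box that does not touch the evidence region."""
    for _ in range(max_tries):
        top = rng.randrange(0, height - size + 1)
        left = rng.randrange(0, width - size + 1)
        box = (top, left, top + size, left + size)
        if not boxes_overlap(box, evidence):
            return box
    raise ValueError("no non-overlapping box found")


def build_variants(image: Image, evidence: Box, answer: str,
                   rng: random.Random, size: int = 2) -> Dict[str, Tuple[Image, str]]:
    """Build one (image, target) pair per paradigm under the stated assumptions."""
    height, width = len(image), len(image[0])
    distractor = random_box_outside(height, width, evidence, size, rng)
    return {
        "positive": (image, answer),
        # Assumption: with the evidence hidden, the model should abstain.
        "counterfactual": (mask_region(image, evidence), "unknown"),
        # Assumption: hiding an irrelevant region should not change the answer.
        "random_masking": (mask_region(image, distractor), answer),
    }


image = [[1] * 6 for _ in range(6)]
variants = build_variants(image, evidence=(0, 0, 2, 2), answer="red", rng=random.Random(0))
```

Under these assumptions, a model that answers correctly on the counterfactual sample is relying on something other than the evidence region, which is the inconsistency the framework sets out to penalize.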

Entities

Institutions

  • arXiv
