ARTFEED — Contemporary Art Intelligence

DecomposeRL: RL-Based Claim Verification with Traceable Reasoning

ai-technology · 2026-05-28

DecomposeRL is a claim-verification system that aims to combine the accuracy of end-to-end classifiers with the transparency of decomposition-based methods. It treats claim decomposition as a reinforcement learning policy trained with GRPO and a multi-faceted reward ensemble, which supports both fully supervised training and semi-supervised training on unlabeled claims. To contain the high cost of GRPO training, DecomposeRL applies a data-curation funnel that distills 115K fact-verification claims down to about 5K. A DecomposeRL-7B policy, trained under full supervision on those roughly 5K curated claims, reaches a balanced accuracy of 86.3 in-domain and 69.8 out-of-domain across 11 claim-verification benchmarks covering biomedical, political, scientific, and general-domain claims.
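For readers unfamiliar with GRPO, the core idea is that several candidate outputs are sampled for the same input and each one is scored relative to its own group, rather than against a learned value function. The sketch below illustrates that step with a weighted reward ensemble. It is an illustration only, not DecomposeRL's implementation: the reward component names (`verdict_correct`, `format_ok`), the weights, and the use of the population standard deviation are all assumptions made for the example.

```python
from statistics import mean, pstdev

def ensemble_reward(components, weights):
    """Weighted sum of reward components (names and weights are hypothetical)."""
    return sum(weights[name] * value for name, value in components.items())

def group_relative_advantages(rewards, eps=1e-8):
    """GRPO-style advantages: standardize each reward against its own group.

    Assumption: population standard deviation; eps guards against a
    zero-variance group where every sample got the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]

# Hypothetical scores for 4 sampled decompositions of a single claim.
weights = {"verdict_correct": 1.0, "format_ok": 0.5}
group = [
    {"verdict_correct": 1.0, "format_ok": 1.0},
    {"verdict_correct": 0.0, "format_ok": 1.0},
    {"verdict_correct": 1.0, "format_ok": 0.0},
    {"verdict_correct": 0.0, "format_ok": 0.0},
]
rewards = [ensemble_reward(c, weights) for c in group]
advantages = group_relative_advantages(rewards)
```

Samples that score above the group mean receive a positive advantage and are reinforced; those below the mean are discouraged. Because every training claim needs a whole group of sampled outputs, the cost scales with the number of claims, which is consistent with the motivation for shrinking the training set.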

Key facts

  • DecomposeRL frames decomposition as an RL policy trained with GRPO
  • Uses a multi-faceted reward ensemble
  • Enables semi-supervised learning from unlabeled claims
  • Data-curation funnel distills 115K claims to 5K
  • DecomposeRL-7B achieves 86.3 in-domain balanced accuracy
  • Achieves 69.8 out-of-domain balanced accuracy
  • Tested on 11 claim-verification benchmarks
  • Covers biomedical, political, scientific, and general-domain claims
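
The headline numbers are balanced accuracy, which is the mean of per-class recall and so is not inflated when one verdict class dominates a benchmark. A minimal sketch of the metric follows; the label names and the toy data are hypothetical, and the function returns a fraction in [0, 1], whereas the scores above appear to be reported on a 0 to 100 scale.

```python
from collections import defaultdict

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)

# Hypothetical imbalanced benchmark: 8 supported claims, 2 refuted claims.
y_true = ["supported"] * 8 + ["refuted"] * 2
# The verifier gets every supported claim right but misses one refuted claim.
y_pred = ["supported"] * 8 + ["supported", "refuted"]

score = balanced_accuracy(y_true, y_pred)
plain = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

In this toy case plain accuracy is 0.9 while balanced accuracy is 0.75, because the missed refuted claim halves the recall of the minority class.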

Entities

Institutions

  • arXiv
