ARTFEED — Contemporary Art Intelligence

CLORE: A Framework for Efficient LLM Reasoning via Content-Level Optimization

ai-technology · 2026-05-23

Researchers propose CLORE, a content-level optimization framework for improving reasoning efficiency in large language models. Reinforcement learning (RL) post-training often produces reasoning traces that are unnecessarily long, repetitive, or semantically opaque. CLORE takes correct on-policy rollouts and has an external augmentation model edit them, deleting repetitive, illegible, or task-irrelevant content while preserving the final answer. The resulting augmented-original pairs are optimized with an auxiliary reference-free Direct Preference Optimization (DPO) objective alongside standard policy-gradient training. Augmentation is restricted to correct trajectories and performs only local deletion, which keeps the edited outputs concise. The paper is available on arXiv under ID 2605.22211.
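
To make the pair-construction step concrete, here is a minimal sketch in Python. It is not the paper's implementation: the Rollout type, the build_pairs helper, the "Final answer:" rendering, and the toy drop_repeated_lines editor (standing in for the external augmentation model) are all hypothetical names chosen for illustration. It only shows the two constraints described above: edit correct rollouts only, and delete content without touching the final answer.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Rollout:
    """One sampled trajectory (hypothetical container, not from the paper)."""
    prompt: str
    reasoning: str
    answer: str


def drop_repeated_lines(reasoning: str) -> str:
    """Toy stand-in for the augmentation model: delete exact repeated lines."""
    seen = set()
    kept = []
    for line in reasoning.splitlines():
        key = line.strip().lower()
        if key in seen:
            continue  # local deletion of a repeated segment
        seen.add(key)
        kept.append(line)
    return "\n".join(kept)


def render(reasoning: str, answer: str) -> str:
    """Join reasoning and answer; the answer is never edited."""
    return f"{reasoning}\nFinal answer: {answer}"


def build_pairs(
    rollouts: List[Rollout],
    gold: Dict[str, str],
    augment: Callable[[str], str] = drop_repeated_lines,
) -> List[Tuple[str, str, str]]:
    """Return (prompt, chosen, rejected) triples: edited trace vs. original."""
    pairs = []
    for r in rollouts:
        if r.answer != gold[r.prompt]:
            continue  # augmentation is restricted to correct trajectories
        edited = augment(r.reasoning)
        if edited == r.reasoning:
            continue  # nothing was deleted, so there is no preference signal
        pairs.append((r.prompt, render(edited, r.answer), render(r.reasoning, r.answer)))
    return pairs
```

In this sketch the shortened trace is treated as the preferred ("chosen") response and the original rollout as the dispreferred one. Because only the reasoning text is passed to the editor, the final answer is preserved by construction.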

Key facts

  • CLORE stands for Content-Level Optimization for Reasoning Efficiency
  • arXiv ID: 2605.22211
  • Addresses unnecessarily long, repetitive, or semantically opaque reasoning traces from RL post-training
  • Uses an external augmentation model to delete repetitive segments, illegible or task-irrelevant content, and superfluous reasoning
  • Preserves the final answer
  • Optimizes augmented-original pairs with an auxiliary reference-free DPO objective
  • Restricts augmentation to correct trajectories and performs local deletion
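
The auxiliary objective listed above can be illustrated with a small Python sketch. It assumes the common reference-free form of the DPO loss, -log sigmoid(beta * (log p(chosen) - log p(rejected))), with the edited trace as "chosen" and the original rollout as "rejected". The summary does not give the exact formula, so the function names, the beta temperature, and the aux_weight mixing coefficient are assumptions made for illustration only.

```python
import math
from typing import List, Tuple


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def reference_free_dpo_loss(logp_chosen: float, logp_rejected: float, beta: float = 1.0) -> float:
    """-log(sigmoid(beta * (logp_chosen - logp_rejected))), with no reference model term."""
    margin = beta * (logp_chosen - logp_rejected)
    return softplus(-margin)  # identical to -log(sigmoid(margin))


def combined_loss(
    pg_loss: float,
    pair_logps: List[Tuple[float, float]],
    beta: float = 1.0,
    aux_weight: float = 0.1,  # hypothetical mixing coefficient
) -> float:
    """Policy-gradient loss plus the weighted mean auxiliary pair loss."""
    if not pair_logps:
        return pg_loss  # no edited pairs in this batch
    aux = sum(reference_free_dpo_loss(c, r, beta) for c, r in pair_logps) / len(pair_logps)
    return pg_loss + aux_weight * aux
```

Here logp_chosen and logp_rejected are the policy's sequence log-probabilities for the edited and original traces. The loss equals log 2 when the two are equally likely and shrinks as the policy assigns more probability to the shorter trace.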

Entities

Institutions

  • arXiv