Critique-Driven Reasoning Alignment for LLM Personalization
Critique-Driven Reasoning Alignment (CDRA) is a new approach to aligning Large Language Models (LLMs) with user preferences that reframes alignment from reward-matching to structured reasoning. Personalized alignment poses a dual challenge: inferring users' deep implicit preferences (unstated goals, semantic context, risk tolerances) and performing defensive reasoning in ambiguous real-world scenarios. Because current alignment methods do not bridge this cognitive gap, they produce superficial and brittle responses. CDRA addresses the gap with the DeepPref benchmark, a dataset of 3000 preference-query pairs spanning 20 topics, generated by a simulated multi-faceted cognitive council that produces critique-annotated reasoning chains. The work is published on arXiv under identifier 2510.11194.
Key facts
- CDRA reframes alignment as a structured reasoning process.
- DeepPref benchmark contains 3000 preference-query pairs.
- Pairs cover 20 topics.
- Data is curated by a simulated multi-faceted cognitive council.
- Council produces critique-annotated reasoning chains.
- Method addresses inference of deep implicit preferences.
- Method includes defensive reasoning for ambiguity.
- Published on arXiv with ID 2510.11194.
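The source does not publish a schema for DeepPref records or the council's interface, but the facts above can be sketched as a hypothetical data structure: each preference-query pair carries a reasoning chain, and a set of critic personas (standing in for the multi-faceted cognitive council) annotates every step with critiques. All names and fields below are illustrative assumptions, not the paper's actual design.

```python
from dataclasses import dataclass, field

@dataclass
class ReasoningStep:
    thought: str    # one step in the reasoning chain
    critique: str   # council critiques annotating this step

@dataclass
class PreferenceQueryPair:
    topic: str        # one of the benchmark's 20 topics (assumed label)
    preference: str   # the user's (possibly implicit) preference
    query: str        # the query to answer under that preference
    chain: list[ReasoningStep] = field(default_factory=list)

def council_annotate(thoughts, critics):
    """Toy stand-in for the simulated cognitive council:
    every critic persona reviews every reasoning step."""
    return [
        ReasoningStep(t, " | ".join(c(t) for c in critics))
        for t in thoughts
    ]

# Purely illustrative critic personas, one per "facet".
critics = [
    lambda s: f"risk: does '{s}' respect the user's risk tolerance?",
    lambda s: f"context: is '{s}' grounded in the stated scenario?",
]

pair = PreferenceQueryPair(
    topic="travel",
    preference="prefers low-risk itineraries",
    query="Plan a weekend trip.",
    chain=council_annotate(["Suggest nearby destinations"], critics),
)
print(pair.chain[0].critique)
```

A real pipeline would replace the lambda critics with LLM calls prompted as distinct council facets, but the shape of the output, a chain where each step carries its critiques, is the same.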
Entities
Institutions
- arXiv