ARTFEED — Contemporary Art Intelligence

Controllable User Simulation Formalized as Causal Inference Problem

other · 2026-05-13

A new arXiv paper (2605.11519) formalizes controllable user simulation for conversational AI evaluation as a causal inference problem. The authors argue that standard supervised fine-tuning on post-hoc trajectory labels introduces a look-ahead bias that breaks causal consistency. They prove that under policy shift, this bias causes the variance of evaluation metrics to grow geometrically, a phenomenon they term "controllability collapse." The work bridges natural language evaluation with off-policy evaluation methodology, aiming to enable targeted counterfactual testing of conversational agents.
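The geometric variance growth the paper describes mirrors a well-known property of importance-weighted off-policy evaluation: when a target policy drifts from the behavior policy that generated the data, per-trajectory likelihood-ratio weights compound multiplicatively, so their variance grows geometrically with conversation length. The sketch below is not the paper's construction; it is a minimal toy simulation (hypothetical two-action policies, made-up probabilities 0.5 vs 0.7) illustrating that compounding effect.

```python
import random


def is_weight_stats(horizon, num_trajectories=20000,
                    p_behavior=0.5, p_target=0.7, seed=0):
    """Monte-Carlo estimate of the mean and variance of per-trajectory
    importance weights for a toy two-action dialogue policy.

    At each turn the behavior policy picks action A with prob p_behavior;
    the shifted target policy would pick A with prob p_target. The
    trajectory weight is the product of per-turn likelihood ratios.
    Closed form: E[w] = 1 and Var[w] = rho**horizon - 1, where
    rho = p_target**2 / p_behavior + (1 - p_target)**2 / (1 - p_behavior).
    """
    rng = random.Random(seed)
    weights = []
    for _ in range(num_trajectories):
        w = 1.0
        for _ in range(horizon):
            if rng.random() < p_behavior:          # behavior took action A
                w *= p_target / p_behavior
            else:                                   # behavior took action B
                w *= (1 - p_target) / (1 - p_behavior)
        weights.append(w)
    mean = sum(weights) / len(weights)
    var = sum((w - mean) ** 2 for w in weights) / len(weights)
    return mean, var


if __name__ == "__main__":
    for h in (5, 10, 20):
        mean, var = is_weight_stats(h)
        print(f"horizon={h:2d}  mean≈{mean:.3f}  variance≈{var:.2f}")
```

With these toy numbers rho = 1.16, so the weight variance scales roughly as 1.16**horizon: the estimator stays unbiased (mean near 1) while its variance compounds with every additional turn, which is the mechanism behind a "collapse" of evaluation precision under policy shift.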

Key facts

  • arXiv paper number: 2605.11519
  • Formalizes controllable simulation as causal inference
  • Standard SFT on post-hoc labels causes look-ahead bias
  • Bias breaks causal consistency
  • Under policy shift, variance of evaluation metrics explodes geometrically
  • Phenomenon named controllability collapse
  • Bridges natural language evaluation with off-policy evaluation

Entities

Institutions

  • arXiv
