CODE: Causal Editing Reduces LLM Self-Refutation from 95.6% to 6.6%

publication · 2026-05-28

A new arXiv paper (2605.28303) proposes CODE (Causal On-policy Distillation for Editing), a method that moves knowledge editing from static fact overwriting to causal editing. The authors identify a pathology they call Epistemic Dissonance, in which legacy priors push an LLM to negate the updates injected into it; under zero-distortion conditions this produces a 95.6% self-refutation rate. Grounding updates in explicit causal narratives lowers the conflict rate to 6.6%. CODE couples causal bootstrapping with asymmetric on-policy distillation to internalize this evolution.
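To put the two headline figures side by side, the short sketch below computes the absolute and relative drop. It only restates arithmetic on the reported 95.6% and 6.6% values; the function name `rate_change` is a hypothetical helper, not something from the paper.

```python
def rate_change(before_pct: float, after_pct: float) -> tuple[float, float]:
    """Return (absolute drop in percentage points, relative reduction in percent)."""
    absolute = before_pct - after_pct
    relative = 100.0 * absolute / before_pct
    return absolute, relative


# Figures as reported in the summary above.
absolute, relative = rate_change(95.6, 6.6)
print(f"{absolute:.1f} percentage points, {relative:.1f}% relative reduction")
```

In other words, the reported change is a drop of 89.0 percentage points, or roughly a 93% relative reduction in self-refuting outputs.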

Key facts

Paper ID: arXiv:2605.28303
The Static Fact Overwriting paradigm treats LLMs as discrete databases
Epistemic Dissonance is a pathology arising from fractured pre-trained logical topologies
A zero-distortion proxy yields a 95.6% self-refutation rate
Causal narratives reduce the conflict rate to 6.6%
CODE stands for Causal On-policy Distillation for Editing
The method uses causal bootstrapping and asymmetric on-policy distillation
The paper advocates a paradigm shift toward Causal Editing
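For readers unfamiliar with on-policy distillation in general, the sketch below shows one common formulation: the student samples a sequence, and the loss is the mean per-token reverse KL between the student's and the teacher's next-token distributions along that sequence. This is a generic illustration under stated assumptions, not the paper's objective; the summary does not say what makes CODE's variant asymmetric or how causal bootstrapping feeds into it, and the function names and toy distributions here are hypothetical.

```python
import math


def reverse_kl(student_probs, teacher_probs):
    """KL(student || teacher) for a single next-token distribution."""
    return sum(
        s * math.log(s / t)
        for s, t in zip(student_probs, teacher_probs)
        if s > 0
    )


def on_policy_distillation_loss(student_steps, teacher_steps):
    """Mean per-token reverse KL along a sequence sampled from the student.

    Each argument is a list of next-token distributions, one per position
    of the student-sampled sequence (assumed to be precomputed here).
    """
    per_token = [reverse_kl(s, t) for s, t in zip(student_steps, teacher_steps)]
    return sum(per_token) / len(per_token)


# Toy distributions over a 3-token vocabulary at two positions (illustrative only).
student = [[0.7, 0.2, 0.1], [0.5, 0.4, 0.1]]
teacher = [[0.6, 0.3, 0.1], [0.2, 0.7, 0.1]]
loss = on_policy_distillation_loss(student, teacher)
print(f"mean reverse KL: {loss:.4f}")
```

The loss is zero when the student already matches the teacher at every position and grows as the student places probability on tokens the teacher considers unlikely.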
