CODE: Causal Editing Reduces LLM Self-Refutation from 95.6% to 6.6%
A new paper on arXiv (2605.28303) proposes CODE (Causal On-policy Distillation for Editing), a method that shifts Knowledge Editing from static fact overwriting to causal editing. The authors identify a pathology they call Epistemic Dissonance, in which legacy priors force an LLM to negate the updates injected into it. Under a zero-distortion proxy, this produces a 95.6% self-refutation rate. Grounding updates in explicit causal narratives lowers the conflict rate to 6.6%. CODE couples causal bootstrapping with asymmetric on-policy distillation so that the model internalizes this evolution.
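To put the two reported rates side by side, the arithmetic can be sketched as follows. This is only an illustration: the `EditOutcome` record and both function names are hypothetical, and the summary does not say how the paper decides that a response refutes an edit, so that judgment is represented here by a plain boolean.

```python
from dataclasses import dataclass


@dataclass
class EditOutcome:
    """One evaluated response after a knowledge edit (hypothetical record)."""
    prompt: str
    refutes_edit: bool  # True if the response negates the injected fact


def self_refutation_rate(outcomes):
    """Fraction of responses that negate the injected update."""
    if not outcomes:
        raise ValueError("need at least one outcome")
    return sum(o.refutes_edit for o in outcomes) / len(outcomes)


def relative_reduction(before, after):
    """Drop from `before` to `after`, expressed as a fraction of `before`."""
    return (before - after) / before


# The two rates reported in the summary, written as fractions.
baseline, with_narrative = 0.956, 0.066

print(f"absolute drop: {(baseline - with_narrative) * 100:.1f} points")
print(f"relative reduction: {relative_reduction(baseline, with_narrative):.1%}")
```

Going from 95.6% to 6.6% is a drop of 89.0 percentage points, or roughly a 93% relative reduction.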
Key facts
- Paper ID: arXiv:2605.28303
- Static Fact Overwriting paradigm treats LLMs as discrete databases
- Epistemic Dissonance is a pathology arising from fractured pre-trained logical topologies
- Zero-distortion proxy yields 95.6% self-refutation rate
- Causal narratives reduce conflict rate to 6.6%
- CODE stands for Causal On-policy Distillation for Editing
- Method uses causal bootstrapping and asymmetric on-policy distillation
- Paper advocates for paradigm shift toward Causal Editing
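The summary does not spell out CODE's training objective, so the following is only a generic sketch of what an on-policy distillation loss can look like, not the authors' method. It makes two assumptions that are not from the source: the loss is a per-token reverse KL divergence, KL(student || teacher), computed on a sequence sampled from the student, and the "asymmetric" part means the teacher scores that sequence with the causal narrative in its context while the student sees only the bare prompt. All function names are hypothetical.

```python
import math


def softmax(logits):
    """Convert a list of logits into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reverse_kl(student_probs, teacher_probs, eps=1e-12):
    """KL(student || teacher) at a single token position."""
    return sum(
        p * (math.log(p + eps) - math.log(q + eps))
        for p, q in zip(student_probs, teacher_probs)
    )


def on_policy_distill_loss(student_logits, teacher_logits):
    """Mean per-token reverse KL over one student-sampled sequence.

    student_logits[t]: student logits at position t, given the bare prompt.
    teacher_logits[t]: teacher logits at position t for the SAME tokens,
        given prompt + causal narrative (assumed form of the asymmetry).
    """
    if len(student_logits) != len(teacher_logits):
        raise ValueError("sequences must have the same length")
    if not student_logits:
        raise ValueError("need at least one token position")
    losses = [
        reverse_kl(softmax(s), softmax(t))
        for s, t in zip(student_logits, teacher_logits)
    ]
    return sum(losses) / len(losses)


# Toy example: two token positions over a three-token vocabulary.
student = [[2.0, 0.5, 0.1], [0.3, 0.3, 1.5]]
teacher = [[0.2, 2.5, 0.1], [0.3, 0.3, 1.5]]
print(on_policy_distill_loss(student, teacher))
```

In this toy example the loss is positive because the student and teacher disagree at the first position, and it falls to zero once the student's distribution matches the teacher's at every position.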
Entities
Institutions
- arXiv