SELFCI: A Self-Distillation Framework for Privacy in LLMs

ai-technology · 2026-05-22

A new framework called SELFCI (Self-Distillation for Contextual Integrity) aims to improve privacy in large language models by decoupling information suppression from task resolution. Proposed in an arXiv paper (2605.20258), SELFCI uses complementary self-distillation to optimize two independent reverse KL divergences: one preserves task-relevant information for utility, and the other enforces minimal, appropriate disclosure. Together the two objectives define a Product-of-Experts (PoE) target that is intended to balance privacy and performance without degrading task accuracy. The approach targets Contextual Integrity (CI), the principle that information flows should follow the norms of the context in which they occur. This becomes a critical issue as LLMs are deployed as personal agents handling sensitive workflows.
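
The link between two reverse KL terms and a Product-of-Experts target follows from a standard identity, sketched here as a worked equation. The notation is an assumption made for illustration and is not taken from the paper: q is the model being trained, p_U and p_P are hypothetical "utility" and "privacy" expert distributions over outputs y, and α, β > 0 are hypothetical weights on the two divergences.

```latex
\begin{aligned}
\alpha\,\mathrm{KL}(q \,\|\, p_U) + \beta\,\mathrm{KL}(q \,\|\, p_P)
  &= (\alpha+\beta)\sum_y q(y)\log q(y)
     - \sum_y q(y)\bigl[\alpha\log p_U(y) + \beta\log p_P(y)\bigr] \\
  &= (\alpha+\beta)\,\mathrm{KL}(q \,\|\, p^\ast) - (\alpha+\beta)\log Z, \\[4pt]
p^\ast(y) &= \frac{p_U(y)^{w}\,p_P(y)^{1-w}}{Z},
\qquad w = \frac{\alpha}{\alpha+\beta},
\qquad Z = \sum_y p_U(y)^{w}\,p_P(y)^{1-w}.
\end{aligned}
```

Because Z does not depend on q, the weighted sum is minimized exactly when q = p*, the normalized product of the two experts. An output receives high probability under p* only if both experts assign it non-negligible probability, which is one way to read the claim that suppression and task resolution can be optimized as separate terms.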

Key facts

SELFCI stands for Self-Distillation for Contextual Integrity
It decouples information suppression from task resolution
Uses two independent reverse KL divergences
One divergence preserves task-relevant information
The other enforces minimal and appropriate disclosure
Creates a Product-of-Experts (PoE) target
Aims to overcome the privacy-utility trade-off
Paper published on arXiv with ID 2605.20258
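
As a rough numerical illustration of the facts listed above, the sketch below builds a Product-of-Experts target from two toy distributions and evaluates the sum of two reverse KL divergences against it. Everything here is a hypothetical assumption for illustration: the candidate outputs, the probabilities, the equal weighting `w=0.5`, and all function names. It is not the paper's implementation, which operates on LLM output distributions during self-distillation.

```python
import math


def normalize(weights):
    """Scale non-negative weights so they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


def reverse_kl(q, p):
    """KL(q || p) for discrete distributions; terms with q_i == 0 contribute 0."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0)


def product_of_experts(p_a, p_b, w=0.5):
    """Normalized weighted geometric mean p_a^w * p_b^(1 - w)."""
    return normalize([a ** w * b ** (1 - w) for a, b in zip(p_a, p_b)])


def combined_loss(q, p_a, p_b, w=0.5):
    """Weighted sum of the two reverse KL divergences."""
    return w * reverse_kl(q, p_a) + (1 - w) * reverse_kl(q, p_b)


# Hypothetical candidate outputs and expert distributions (made-up numbers).
tokens = ["answer", "clarify", "leak_detail", "refuse"]
p_utility = [0.60, 0.10, 0.25, 0.05]  # favors completing the task
p_privacy = [0.40, 0.20, 0.01, 0.39]  # penalizes the sensitive output

# PoE target versus a simple average of the two experts, for comparison.
target = product_of_experts(p_utility, p_privacy)
mixture = normalize([a + b for a, b in zip(p_utility, p_privacy)])

loss_target = combined_loss(target, p_utility, p_privacy)
loss_mixture = combined_loss(mixture, p_utility, p_privacy)

for name, prob in zip(tokens, target):
    print(f"{name:12s} {prob:.3f}")
print(f"loss at PoE target: {loss_target:.4f}")
print(f"loss at mixture:    {loss_mixture:.4f}")
```

In this toy setup the PoE target gives `leak_detail` less probability than the utility expert alone does, and it reaches a lower combined loss than the averaged mixture. That is consistent with the PoE distribution being the minimizer of the summed reverse KL objective.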
