LLM privacy reasoning improved via normative simulacra from fiction
A new arXiv preprint (2604.20904) proposes extracting normative simulacra (structured representations of norms and information flows) from fiction novels and using them to fine-tune large language models (LLMs) for better privacy reasoning. The approach applies supervised fine-tuning followed by GRPO (Group Relative Policy Optimization) reinforcement learning, with a composite reward function that combines programmatic signals (task clarity, structural completeness, internal consistency, context identification) with an LLM judge. The aim is to close the gap between how LLM agents handle information and users' contextual privacy expectations, as defined by the Contextual Integrity (CI) framework, without doubling inference costs or relying on narrow task-specific data.
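To make the reward design concrete, here is a minimal Python sketch of such a composite reward, assuming a weighted sum of programmatic checks and an LLM-judge score. All names, slot definitions, and weights are illustrative assumptions, not the paper's implementation.

```python
# Hedged sketch of a composite reward: programmatic checks plus an
# LLM-judge score, combined as a weighted sum. Every name, weight,
# and check below is an illustrative assumption.
from dataclasses import dataclass
from typing import Callable

@dataclass
class SimulacrumOutput:
    task: str                    # task the extracted norm applies to
    sender: str                  # CI-style information-flow parameters
    recipient: str
    information_type: str
    transmission_principle: str

def task_clarity(out: SimulacrumOutput) -> float:
    # Programmatic signal: a crude proxy for a clearly stated task.
    return 1.0 if len(out.task.split()) >= 3 else 0.0

def structural_completeness(out: SimulacrumOutput) -> float:
    # Programmatic signal: fraction of required slots that are filled.
    slots = [out.task, out.sender, out.recipient,
             out.information_type, out.transmission_principle]
    return sum(bool(s.strip()) for s in slots) / len(slots)

def composite_reward(out: SimulacrumOutput,
                     llm_judge: Callable[[SimulacrumOutput], float],
                     w_prog: float = 0.5,
                     w_judge: float = 0.5) -> float:
    # Internal-consistency and context-identification signals would
    # plug into the programmatic term in the same way.
    prog = 0.5 * task_clarity(out) + 0.5 * structural_completeness(out)
    return w_prog * prog + w_judge * llm_judge(out)
```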
Key facts
- arXiv preprint 2604.20904 proposes extracting normative simulacra from fiction novels
- Method uses supervised fine-tuning followed by GRPO (Group Relative Policy Optimization) reinforcement learning (see the sketch after this list)
- Composite reward function combines programmatic signals with an LLM judge
- Addresses misalignment between LLM agents' information handling and users' contextual privacy expectations
- Based on the Contextual Integrity (CI) framework
- Avoids both doubled inference costs and narrow task-specific fine-tuning
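GRPO scores each sampled completion relative to the other completions drawn for the same prompt, so no learned value model is needed. The sketch below shows the standard group-relative advantage computation; it reflects GRPO in general, not this paper's specific training setup.

```python
# Minimal sketch of GRPO's group-relative advantage computation,
# assuming the standard formulation: sample G completions per prompt,
# score each with the reward function, and standardize the rewards
# within the group.
import statistics

def group_relative_advantages(rewards: list[float],
                              eps: float = 1e-6) -> list[float]:
    # Each completion's advantage is its reward relative to the other
    # completions sampled for the same prompt.
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]

# Example: composite-reward scores for four sampled completions.
print(group_relative_advantages([0.9, 0.4, 0.7, 0.2]))
```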