POLAR-Bench: Benchmarking Privacy-Utility Trade-offs in LLM Agents
POLAR-Bench (Policy-aware adversarial Benchmark) is a diagnostic tool for assessing the balance between privacy and utility in large language model (LLM) agents. In each scenario, a trusted model equipped with a privacy policy interacts with a third-party model that tries to extract both task-relevant information and sensitive attributes. The benchmark covers 10 domains and 7,852 samples, and scores both privacy and utility with deterministic set-membership checks. It varies the privacy policy dimension and the attack strategy along two orthogonal axes, producing a 5×5 diagnostic surface for each model. The findings show a significant divide: frontier models withhold over 99% of protected attributes, while smaller open-weight models (1–30B parameters) perform worse.
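To make the idea of deterministic set-membership scoring concrete, here is a minimal sketch. It is not the benchmark's actual scorer: the function name `score_response`, the attribute names, and the choice to report utility and privacy as simple fractions are all assumptions made for illustration.

```python
def score_response(disclosed, task_attrs, protected_attrs):
    """Score one interaction by set membership (hypothetical helper).

    disclosed:       attributes the trusted model revealed
    task_attrs:      attributes the task legitimately needs
    protected_attrs: attributes the privacy policy says to withhold
    Returns (utility, privacy), each a fraction in [0, 1].
    """
    disclosed = set(disclosed)
    task_attrs = set(task_attrs)
    protected_attrs = set(protected_attrs)

    # Utility: share of task-relevant attributes that were actually shared.
    utility = len(disclosed & task_attrs) / len(task_attrs) if task_attrs else 1.0
    # Privacy: share of protected attributes that were NOT disclosed.
    withheld = protected_attrs - disclosed
    privacy = len(withheld) / len(protected_attrs) if protected_attrs else 1.0
    return utility, privacy


# Illustrative attribute names only; they do not come from the benchmark.
utility, privacy = score_response(
    disclosed={"appointment_date", "insurance_provider", "diagnosis"},
    task_attrs={"appointment_date", "insurance_provider"},
    protected_attrs={"diagnosis", "home_address"},
)
print(utility, privacy)  # 1.0 0.5
```

Because the score depends only on which attributes appear in each set, the same transcript always yields the same result, with no judge model involved.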
Key facts
- POLAR-Bench evaluates privacy-utility trade-offs in LLM agents.
- It uses a trusted model with a privacy policy and a third-party adversarial model.
- The benchmark covers 10 domains and 7,852 samples.
- Scoring is done via deterministic set-membership.
- Privacy policy dimension and attack strategy vary along two orthogonal axes.
- A 5×5 diagnostic surface is produced per model.
- Frontier models withhold over 99% of protected attributes.
- Smaller open-weight models (1–30B parameters) perform worse than frontier models.
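The 5×5 diagnostic surface can be pictured as a grid with one cell per (policy setting, attack strategy) pair. The sketch below assumes five levels on each axis, indexed 0 to 4, and assumes each cell holds the mean privacy score of the samples that fall into it. The function name `diagnostic_surface`, the record layout, and the averaging rule are hypothetical, not taken from the benchmark.

```python
def diagnostic_surface(records, n_policy=5, n_attack=5):
    """Aggregate per-sample scores into an n_policy x n_attack grid.

    records: iterable of (policy_idx, attack_idx, privacy_score) tuples,
             with indices in range(n_policy) and range(n_attack).
    Returns a nested list; cells with no samples are None.
    """
    sums = [[0.0] * n_attack for _ in range(n_policy)]
    counts = [[0] * n_attack for _ in range(n_policy)]
    for policy_idx, attack_idx, score in records:
        sums[policy_idx][attack_idx] += score
        counts[policy_idx][attack_idx] += 1
    # Mean score per cell; None marks a cell that received no samples.
    return [
        [sums[p][a] / counts[p][a] if counts[p][a] else None
         for a in range(n_attack)]
        for p in range(n_policy)
    ]


# Made-up scores purely to exercise the function.
surface = diagnostic_surface([(0, 0, 1.0), (0, 0, 0.5), (4, 4, 0.0)])
print(surface[0][0])  # 0.75
print(surface[4][4])  # 0.0
print(surface[1][1])  # None
```

Reading along a row shows how one policy setting holds up as attacks change, and reading down a column shows how one attack fares against different policy settings.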
Entities
Institutions
- arXiv