POLAR-Bench: Benchmarking Privacy-Utility Trade-offs in LLM Agents

ai-technology · 2026-05-20

POLAR-Bench (Policy-aware adversarial Benchmark) is a diagnostic benchmark for measuring the trade-off between privacy and utility in large language model (LLM) agents. Each scenario pairs a trusted model, which is given a privacy policy, with a third-party model that tries to extract both task-relevant information and sensitive attributes. The benchmark spans 10 domains and 7,852 samples, and scores privacy and utility through deterministic set-membership. It varies the privacy policy dimension and the attack strategy along two orthogonal axes, producing a 5×5 diagnostic surface for each model. The results show a clear divide: frontier models withhold over 99% of protected attributes, while smaller open-weight models (in the 1–30B range) perform worse.
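
As an illustration of what deterministic set-membership scoring can look like, here is a minimal Python sketch. It is not the benchmark's actual code: the function name `score_sample`, the attribute names, and the exact ratios are assumptions made for this example, and it presumes the attributes disclosed in a conversation have already been identified.

```python
def score_sample(disclosed, task_relevant, protected):
    """Score one conversation by set membership (hypothetical formulas).

    disclosed     -- attributes the trusted model revealed (assumed already extracted)
    task_relevant -- attributes the task needs and the policy allows sharing
    protected     -- attributes the policy says must be withheld
    """
    disclosed = set(disclosed)
    task_relevant = set(task_relevant)
    protected = set(protected)

    # Utility: fraction of task-relevant attributes that were shared.
    utility = len(disclosed & task_relevant) / len(task_relevant) if task_relevant else 1.0
    # Privacy: fraction of protected attributes that were withheld.
    privacy = 1.0 - len(disclosed & protected) / len(protected) if protected else 1.0
    return {"utility": utility, "privacy": privacy}


# Illustrative attribute names, not taken from the benchmark.
example = score_sample(
    disclosed={"destination", "travel_dates", "passport_number"},
    task_relevant={"destination", "travel_dates", "budget"},
    protected={"passport_number", "home_address", "medical_condition", "salary"},
)
print(example)
```

Because both scores reduce to set intersections, the same inputs always give the same result, with no judge model involved in the scoring step.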

Key facts

POLAR-Bench evaluates privacy-utility trade-offs in LLM agents.
It uses a trusted model with a privacy policy and a third-party adversarial model.
The benchmark covers 10 domains and 7,852 samples.
Scoring is done via deterministic set-membership.
Privacy policy dimension and attack strategy vary along two orthogonal axes.
A 5×5 diagnostic surface is produced per model.
Frontier models withhold over 99% of protected attributes.
Smaller open-weight models (1–30B) show lower performance.
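
The 5×5 diagnostic surface can be pictured as a grid with one cell per combination of policy setting and attack strategy. The Python sketch below averages per-sample scores into such a grid. It is an assumption-laden illustration: the source does not name the five settings on either axis, so they are represented by the indices 0 to 4, and `diagnostic_surface` is a hypothetical helper, not part of the benchmark.

```python
N_POLICY = 5  # settings along the privacy policy axis (indices are placeholders)
N_ATTACK = 5  # settings along the attack strategy axis (indices are placeholders)


def diagnostic_surface(records):
    """Average scores into a 5x5 grid.

    records -- iterable of (policy_idx, attack_idx, score) tuples
    Returns a list of 5 rows with 5 entries each; a cell with no samples is None.
    """
    sums = [[0.0] * N_ATTACK for _ in range(N_POLICY)]
    counts = [[0] * N_ATTACK for _ in range(N_POLICY)]
    for policy_idx, attack_idx, score in records:
        sums[policy_idx][attack_idx] += score
        counts[policy_idx][attack_idx] += 1
    return [
        [
            sums[p][a] / counts[p][a] if counts[p][a] else None
            for a in range(N_ATTACK)
        ]
        for p in range(N_POLICY)
    ]


# Made-up scores, used only to show the shape of the result.
records = [(0, 0, 1.0), (0, 0, 0.5), (4, 4, 0.0)]
surface = diagnostic_surface(records)
for row in surface:
    print(row)
```

Reading along a row shows how one policy setting holds up as the attack strategy changes. Reading down a column shows how one attack fares against different policy settings.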
