Hidden-State Privacy Mechanisms Fail to Balance Utility and Security

other · 2026-05-26

A study posted to arXiv (2605.24042) tests 1,536 Gaussian release covariances for single-layer hidden-state privacy and finds that none achieves both moderate utility and moderate privacy against an adaptive retrieval attacker. A Fisher-ball lower bound matches this empirical result: every full-rank Gaussian release with O(1) Fisher utility has a direction along which the Mahalanobis signal grows linearly, which rules out uniform safety. The diagonal inverse-Fisher release Σ⋆diag(K) = (2K/d) diag(1/F_ii) is minimax-optimal among diagonal mechanisms at first-order KL budget K and holds worst-attacker top-1 to ≤ 0.001 at every point of a 32 model-layer grid, but it sits on a privacy/utility edge. A generalized-eigen mechanism reaches 13× Pareto reduction under Euclidean retrieval but collapses to 1.
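
The formula for the diagonal inverse-Fisher release can be made concrete with a small sketch. It is not taken from the paper: the function names, the toy Fisher diagonal, and the reading of "first-order KL budget" as (1/2) tr(F Σ) are all assumptions made for illustration. Under that reading, the per-coordinate variances (2K/d)/F_ii spend exactly K.

```python
import math
import random


def diag_inverse_fisher_variances(fisher_diag, K):
    """Per-coordinate variances of (2K/d) * diag(1/F_ii) (hypothetical helper)."""
    if any(f <= 0 for f in fisher_diag):
        raise ValueError("Fisher diagonal entries must be positive")
    d = len(fisher_diag)
    return [(2.0 * K / d) / f for f in fisher_diag]


def release(hidden, variances, rng):
    """Add independent zero-mean Gaussian noise to each hidden-state coordinate."""
    return [h + rng.gauss(0.0, math.sqrt(v)) for h, v in zip(hidden, variances)]


def first_order_kl(fisher_diag, variances):
    """ASSUMPTION: first-order KL cost is 0.5 * tr(F Sigma).

    For a diagonal Sigma only the diagonal of F contributes to the trace.
    """
    return 0.5 * sum(f * v for f, v in zip(fisher_diag, variances))


# Toy values, not from the study.
fisher_diag = [0.5, 2.0, 4.0, 8.0]
K = 0.1
variances = diag_inverse_fisher_variances(fisher_diag, K)
released = release([0.0] * len(fisher_diag), variances, random.Random(0))
budget_used = first_order_kl(fisher_diag, variances)
```

Coordinates with small Fisher information receive the most noise, and each of the d coordinates contributes K/d to the assumed budget, so `budget_used` equals K up to floating-point error.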

Key facts

1,536 Gaussian release covariances tested for single-layer hidden-state privacy
None achieves both moderate utility and moderate privacy against an adaptive retrieval attacker
Fisher-ball lower bound: every full-rank Gaussian release at O(1) Fisher utility has a direction with linearly growing Mahalanobis signal
Diagonal inverse-Fisher release Σ⋆diag(K) = (2K/d) diag(1/F_ii) is unique minimax-optimal diagonal mechanism at first-order KL budget K
Worst-attacker top-1 ≤ 0.001 at every point of a 32 model-layer grid for the diagonal mechanism
Generalized-eigen mechanism reaches 13× Pareto reduction under Euclidean retrieval
Study published on arXiv with ID 2605.24042
Empirical empty middle (no tested covariance with both moderate utility and moderate privacy) matches the theoretical bound
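
The retrieval attacker and the top-1 metric in the facts above can be pictured with a toy sketch. The summary does not specify the attack, so the model here is an assumption: the attacker knows the diagonal release covariance, scores each candidate hidden state by squared Mahalanobis distance to the released vector, and guesses the closest one. All names are hypothetical.

```python
import math
import random


def mahalanobis_sq(x, y, variances):
    """Squared Mahalanobis distance under a diagonal covariance."""
    return sum((a - b) ** 2 / v for a, b, v in zip(x, y, variances))


def retrieve(released, candidates, variances):
    """Index of the candidate closest to the released vector."""
    return min(
        range(len(candidates)),
        key=lambda i: mahalanobis_sq(released, candidates[i], variances),
    )


def top1_rate(candidates, variances, trials, rng):
    """Fraction of trials in which the attacker recovers the true candidate."""
    hits = 0
    for _ in range(trials):
        target = rng.randrange(len(candidates))
        released = [
            h + rng.gauss(0.0, math.sqrt(v))
            for h, v in zip(candidates[target], variances)
        ]
        hits += retrieve(released, candidates, variances) == target
    return hits / trials


# Toy candidates, not from the study.
candidates = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
weak_noise_rate = top1_rate(candidates, [1e-6, 1e-6], 200, random.Random(0))
strong_noise_rate = top1_rate(candidates, [25.0, 25.0], 200, random.Random(0))
```

With very small noise the attacker almost always succeeds, and with large noise the rate falls toward chance (1/3 for three candidates). A privacy target such as top-1 ≤ 0.001 would correspond to pushing this rate close to chance over a much larger candidate pool.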
