Data Attribution Vulnerable to Manipulation in Distributed ML

ai-technology · 2026-05-18

A recent study finds that data attribution in distributed machine learning is vulnerable to manipulation. The researchers demonstrate that a single participant in a typical distributed training setup can artificially inflate their own attribution score without degrading overall model performance. The attack, termed attribution-first, uses latent optimization to inject small synthetic batches that exploit non-IID label coverage and evaluator sensitivities. It reliably raises the adversary's attribution score across various datasets, models, and marginal-utility evaluators, and it also reshapes the relative attribution among benign clients. Notably, it neither reduces accuracy nor triggers geometry-based defenses. These results suggest that attribution itself is a new attack surface, with implications for pricing, auditing, and governance in ML systems. The paper is available on arXiv under ID 2605.15520.
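To make the role of label coverage concrete, here is a minimal toy sketch of a leave-one-out marginal-utility evaluator. It is not the paper's evaluator or its latent-optimization attack. The utility function (fraction of validation labels covered by the pooled training labels), the client names, and the data are all hypothetical, chosen only to illustrate why a small batch covering a label that no other client holds can earn a disproportionate attribution score.

```python
from typing import Dict, List, Set


def utility(labels_seen: Set[int], val_labels: List[int]) -> float:
    """Toy utility (assumption): share of validation examples whose
    label appears anywhere in the pooled training data."""
    return sum(1 for y in val_labels if y in labels_seen) / len(val_labels)


def leave_one_out(clients: Dict[str, List[int]],
                  val_labels: List[int]) -> Dict[str, float]:
    """Score each client by the utility lost when that client is removed."""
    all_labels: Set[int] = set().union(*clients.values())
    full = utility(all_labels, val_labels)
    scores: Dict[str, float] = {}
    for name in clients:
        # Pool the labels of every client except the one being scored.
        rest: Set[int] = set()
        for other, labels in clients.items():
            if other != name:
                rest.update(labels)
        scores[name] = full - utility(rest, val_labels)
    return scores


# Hypothetical non-IID setup: A and B hold large, overlapping data;
# C holds a tiny batch for a label nobody else covers.
clients = {
    "A": [0] * 50 + [1] * 50,
    "B": [0] * 50 + [1] * 50,
    "C": [2] * 4,
}
val_labels = [0] * 10 + [1] * 10 + [2] * 10

scores = leave_one_out(clients, val_labels)
for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```

In this toy setup the two large clients are redundant with each other and score zero, while the four-example client accounts for the entire gain on the third label. A real evaluator would retrain or approximate model utility rather than count labels, but the same sensitivity to coverage gaps is what an attribution-focused adversary could target.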

Key facts

Data attribution in distributed ML can be manipulated by a single participant.
The attribution-first attack uses latent optimization to inject synthetic batches.
The attack exploits non-IID label coverage and evaluator sensitivities.
It increases the adversary's attribution value without degrading accuracy.
The attack reshapes relative attribution among benign clients.
It does not trigger geometry-based defenses.
The study shows attribution forms a new attack surface.
The paper is on arXiv with ID 2605.15520.
