Dataset Poisoning Watermarking for Contrastive Learning
A recent study published on arXiv (2605.01834) examines data-poisoning backdoor attacks on contrastive learning (CL) and identifies several shortcomings: poor adaptability across datasets, low attack success rates, limited portability, and restrictive assumptions requiring knowledge of downstream tasks. Turning this vulnerability to defensive use, the researchers repurpose the statistical divergence of trigger samples from clean samples as a watermark for protecting dataset intellectual property, and they verify ownership through a statistical test built on a unified density metric, which improves verification success rates. They further propose a multi-level watermarking scheme that adapts to feature-level representations.
Key facts
- arXiv:2605.01834v1
- Contrastive learning reduces annotation cost via auto-derived supervisory signals
- Building large-scale CL datasets in-house is infeasible
- CL models are vulnerable to data-poisoning backdoor attacks
- Limitations: poor dataset adaptability, low success rates, limited portability, restrictive assumptions
- Trigger samples exhibit distinguishable statistical divergence from clean samples
- Repurposed as watermark for dataset IP protection
- Multi-level watermarking scheme proposed