Contrastive Learning Improves Feature Attribution in Neural Networks
A new study on arXiv (2604.22540) empirically demonstrates that neural networks trained with Supervised Contrastive Learning (SCL) produce higher-quality feature attribution explanations than those trained with cross-entropy. SCL uses class labels as the similarity criterion, shaping an embedding space in which points from the same class cluster together; this has known benefits for adversarial robustness and out-of-distribution detection, making SCL attractive for safety-critical applications. The study focuses on image classification tasks.
Key facts
- arXiv paper 2604.22540
- Supervised Contrastive Learning (SCL) vs Cross-Entropy (CE)
- SCL improves feature attribution quality
- SCL enhances adversarial robustness
- SCL improves out-of-distribution detection
- Study focuses on image classification
- SCL uses labels as similarity criteria
- SCL creates clustered embedding space
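The mechanism described above (labels as the similarity criterion, clustering in embedding space) can be sketched with a minimal supervised contrastive loss in the style of Khosla et al. This is an illustrative NumPy implementation, not the paper's code; the function name, temperature value, and toy data are assumptions for the sketch.

```python
import numpy as np

def supcon_loss(embeddings, labels, temperature=0.1):
    """Sketch of a supervised contrastive loss.

    embeddings: (N, D) array, L2-normalized internally.
    labels: (N,) integer class labels; same-label pairs are positives.
    Pulls same-class embeddings together, pushes other classes apart.
    """
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = z @ z.T / temperature                      # pairwise cosine similarities
    n = len(labels)
    not_self = ~np.eye(n, dtype=bool)                # exclude self-comparisons
    sim = sim - sim.max(axis=1, keepdims=True)       # numerical stability
    exp_sim = np.exp(sim) * not_self
    log_prob = sim - np.log(exp_sim.sum(axis=1, keepdims=True))
    # positives: same label, excluding the anchor itself
    pos_mask = (labels[:, None] == labels[None, :]) & not_self
    pos_count = pos_mask.sum(axis=1)
    valid = pos_count > 0                            # anchors with >=1 positive
    per_anchor = -(pos_mask * log_prob).sum(axis=1)[valid] / pos_count[valid]
    return per_anchor.mean()

# Toy check: clustered same-class embeddings should yield lower loss
labels = np.array([0, 0, 1, 1])
clustered = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
mixed = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
assert supcon_loss(clustered, labels) < supcon_loss(mixed, labels)
```

The inequality at the end illustrates the core claim of the key facts: the loss rewards exactly the label-driven clustering that SCL induces in the embedding space.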