Contrastive Learning Improves Feature Attribution in Neural Networks
A new study on arXiv (2604.22540) empirically demonstrates that neural networks trained with Supervised Contrastive Learning (SCL) produce higher-quality feature attribution explanations than those trained with cross-entropy. SCL uses class labels as the similarity criterion, shaping an embedding space in which points from the same class cluster together; this has known benefits for adversarial robustness and out-of-distribution detection, making SCL attractive for safety-critical applications. The study focuses on image classification tasks.
Key facts
- arXiv paper 2604.22540
- Supervised Contrastive Learning (SCL) vs Cross-Entropy (CE)
- SCL improves feature attribution quality
- SCL enhances adversarial robustness
- SCL improves out-of-distribution detection
- Study focuses on image classification
- SCL uses labels as similarity criteria
- SCL creates clustered embedding space
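The mechanism described above (labels as the similarity criterion, clustering in embedding space) can be sketched with a minimal supervised contrastive loss in the style of Khosla et al. This is an illustrative NumPy implementation, not the paper's code; the function name, temperature value, and toy data are assumptions for the sketch.

```python
import numpy as np

def supcon_loss(embeddings, labels, temperature=0.1):
    """Sketch of a supervised contrastive loss.

    embeddings: (N, D) array, L2-normalized internally.
    labels: (N,) integer class labels; same-label pairs are positives.
    Pulls same-class embeddings together, pushes other classes apart.
    """
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = z @ z.T / temperature                      # pairwise cosine similarities
    n = len(labels)
    not_self = ~np.eye(n, dtype=bool)                # exclude self-comparisons
    sim = sim - sim.max(axis=1, keepdims=True)       # numerical stability
    exp_sim = np.exp(sim) * not_self
    log_prob = sim - np.log(exp_sim.sum(axis=1, keepdims=True))
    # positives: same label, excluding the anchor itself
    pos_mask = (labels[:, None] == labels[None, :]) & not_self
    pos_count = pos_mask.sum(axis=1)
    valid = pos_count > 0                            # anchors with >=1 positive
    per_anchor = -(pos_mask * log_prob).sum(axis=1)[valid] / pos_count[valid]
    return per_anchor.mean()

# Toy check: clustered same-class embeddings should yield lower loss
labels = np.array([0, 0, 1, 1])
clustered = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
mixed = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
assert supcon_loss(clustered, labels) < supcon_loss(mixed, labels)
```

The inequality at the end illustrates the core claim of the key facts: the loss rewards exactly the label-driven clustering that SCL induces in the embedding space.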