Weight Concentration Regularizer Boosts Pruning Robustness in Neural Networks

other · 2026-05-18

The Weight Concentration Regularizer (WCR) is a new training-time regularizer intended to make deep neural networks more robust to one-shot pruning at high sparsity. Deep neural networks perform well in vision and language applications, but their large parameter counts make them hard to deploy in resource-constrained environments. One-shot pruning reduces model size without retraining; models trained with standard methods, however, can lose substantial accuracy when pruned to high sparsity.

Previous approaches include regularizers such as ℓ1 and DeepHoyer, which reshape the weight distribution, and pruning-robust optimizers such as SAM, CrAM, and S²SAM, which aim to smooth the loss landscape. Existing regularizers either shrink all weights uniformly (ℓ1) or induce scale-invariant sparsity (DeepHoyer), and neither concentrates weight energy on a small set of informative parameters. WCR addresses this by amplifying the magnitude of a small subset of parameters while driving the rest toward zero. The paper is available on arXiv under the identifier 2511.14282.
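The contrast between the two prior regularizers can be made concrete with a small sketch. It is an illustration in plain Python, not code from the paper: the function names and toy weight vectors are assumptions, and the Hoyer-Square form (the squared ℓ1 norm divided by the squared ℓ2 norm) is the measure DeepHoyer is generally described as using. The WCR penalty itself is not reproduced here because the summary above does not give its formula.

```python
def l1_penalty(weights):
    """Sum of absolute values; grows linearly when the weights are rescaled."""
    return sum(abs(w) for w in weights)


def hoyer_square(weights):
    """Hoyer-Square measure: (sum |w|)^2 / (sum w^2).

    Equals 1 when a single weight is nonzero and n when all n weights
    share the same magnitude. Rescaling the weights leaves it unchanged.
    """
    squared_sum = sum(w * w for w in weights)
    if squared_sum == 0.0:
        return 0.0  # convention chosen for this sketch: all-zero vector scores 0
    return l1_penalty(weights) ** 2 / squared_sum


# Toy weight vectors (hypothetical values chosen for exact arithmetic).
dense = [0.5, -0.5, 0.5, -0.5]      # energy spread evenly
peaked = [1.0, 0.0, 0.0, 0.0]       # energy on one weight
scaled = [10 * w for w in dense]    # same shape as `dense`, 10x larger

for name, vec in [("dense", dense), ("peaked", peaked), ("scaled", scaled)]:
    print(name, l1_penalty(vec), hoyer_square(vec))
```

In this toy case the ℓ1 penalty rises from 2.0 to 20.0 when the dense vector is scaled by ten, while the Hoyer-Square value stays at 4.0, which is what "scale-invariant" means here. Neither measure rewards making the surviving weights larger.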

Key facts

Weight Concentration Regularizer (WCR) is proposed for improving pruning robustness.
WCR is a training-time regularizer.
One-shot pruning reduces model size without retraining.
Standard training often causes accuracy drops under aggressive sparsity.
Prior regularizers include ℓ1 and DeepHoyer.
Prior pruning-robust optimizers include SAM, CrAM, and S²SAM.
Existing regularizers do not concentrate weight energy onto informative parameters.
WCR amplifies a small subset of parameters while driving others toward zero.
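
To illustrate why concentrating weight energy could matter for one-shot pruning, the sketch below prunes two toy weight vectors without any retraining and reports how much squared-weight "energy" survives. It assumes magnitude-based pruning as the criterion, which the summary does not specify; the helper names and the example values are hypothetical.

```python
def magnitude_prune(weights, sparsity):
    """Zero the smallest-magnitude fraction `sparsity` of weights in one shot."""
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must lie in [0, 1]")
    n_prune = int(len(weights) * sparsity)
    # Indices ordered by ascending magnitude; the first n_prune are removed.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


def energy_retained(weights, pruned):
    """Fraction of the original sum of squared weights that survives pruning."""
    total = sum(w * w for w in weights)
    kept = sum(w * w for w in pruned)
    return kept / total if total else 0.0


# Hypothetical weight vectors: one spread out, one concentrated.
spread = [0.5, -0.5, 0.5, -0.5]
concentrated = [2.0, 0.1, -0.1, 0.1]

for name, vec in [("spread", spread), ("concentrated", concentrated)]:
    pruned = magnitude_prune(vec, sparsity=0.75)
    print(name, pruned, round(energy_retained(vec, pruned), 4))
```

At 75% sparsity the spread vector keeps only a quarter of its energy, while the concentrated vector keeps more than 99%. Retained energy is only a proxy here; the summary reports no accuracy figures, and the sketch does not measure accuracy.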
