SURGE: Novel Gradient Compensation for Binary Neural Networks
A recent study presents SURrogate GradiEnt Adaptation (SURGE), a framework for learnable gradient compensation aimed at improving the training of Binary Neural Networks (BNNs). BNNs rely on gradient approximations for non-differentiable operations such as the sign function; existing techniques, including the Straight-Through Estimator (STE), suffer from gradient mismatch and from information loss caused by fixed-range gradient clipping. SURGE addresses these problems through auxiliary backpropagation, using a Dual-Path Gradient Compensator (DPGC) that attaches a parallel full-precision auxiliary branch to each binarized layer. Output decomposition decouples the gradient flow during backpropagation, yielding bias-reduced gradient estimation. The paper is available on arXiv under ID 2605.10989.
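To make the problem concrete, the sketch below illustrates the Straight-Through Estimator that SURGE is designed to improve on: the forward pass applies the non-differentiable sign function, while the backward pass passes the incoming gradient through unchanged but zeroes it outside a fixed clipping range. This is a minimal NumPy illustration of the standard STE, not code from the paper; the function names are our own.

```python
import numpy as np

def sign_binarize(x):
    # Forward pass: non-differentiable sign function used in BNNs.
    return np.where(x >= 0.0, 1.0, -1.0)

def ste_backward(x, grad_out, clip=1.0):
    # Straight-Through Estimator: pretend d(sign)/dx = 1, but zero the
    # gradient where |x| exceeds the fixed clipping range. Activations
    # outside [-clip, clip] receive no learning signal, which is the
    # information loss the summary above refers to.
    return grad_out * (np.abs(x) <= clip)

x = np.array([-2.0, -0.5, 0.3, 1.5])
g = np.ones_like(x)
print(sign_binarize(x))   # [-1. -1.  1.  1.]
print(ste_backward(x, g)) # [0. 1. 1. 0.]  (clipped entries get zero gradient)
```

The mismatch is visible in the last line: the true derivative of sign is zero almost everywhere, yet training uses this surrogate, and the clipping mask silently drops gradient for any input outside the fixed range.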
Key facts
- SURGE (SURrogate GradiEnt Adaptation) is a learnable gradient compensation framework for BNNs.
- It addresses the gradient mismatch and information loss of existing surrogate methods such as the STE.
- The Dual-Path Gradient Compensator (DPGC) attaches a parallel full-precision auxiliary branch to each binarized layer.
- DPGC decouples gradient flow via output decomposition during backpropagation, enabling bias-reduced gradient estimation.
- The method is theoretically grounded.
- The paper is available on arXiv under ID 2605.10989.
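The dual-path idea can be sketched as a toy linear layer whose forward pass uses binarized weights while its backward pass blends the clipped STE gradient with the gradient of a full-precision auxiliary branch. This is only a hedged illustration of the general mechanism: the paper's actual output decomposition is not specified in this summary, and the class name `DualPathLinear` and mixing coefficient `alpha` are our own assumptions, not the DPGC formulation.

```python
import numpy as np

def binarize(w):
    # Forward-pass weight binarization used in BNNs.
    return np.where(w >= 0.0, 1.0, -1.0)

class DualPathLinear:
    """Toy dual-path linear layer (illustrative only).

    Forward: binarized weights. Backward: blends the clipped STE gradient
    with the unclipped gradient of a parallel full-precision branch, so
    weights outside the STE clipping range still receive a learning signal.
    `alpha` is a hypothetical mixing coefficient, not the paper's DPGC.
    """
    def __init__(self, w, alpha=0.5):
        self.w = w
        self.alpha = alpha

    def forward(self, x):
        self.x = x
        return x @ binarize(self.w)

    def backward(self, grad_out):
        g = self.x.T @ grad_out
        g_ste = g * (np.abs(self.w) <= 1.0)  # binary path: clipped STE gradient
        g_aux = g                            # auxiliary full-precision path: unclipped
        return (1.0 - self.alpha) * g_ste + self.alpha * g_aux

layer = DualPathLinear(np.array([[1.5], [0.2]]))
x = np.array([[1.0, 2.0]])
y = layer.forward(x)                        # binarize(w) = [[1.], [1.]]
grad_w = layer.backward(np.array([[1.0]]))
print(y)       # [[3.]]
print(grad_w)  # [[0.5], [2.]]  (first weight, clipped by STE, still gets gradient)
```

The point of the example is the last line: the first weight lies outside the STE clipping range and would receive zero gradient under plain STE, but the auxiliary branch keeps it trainable, which is the kind of information loss a learnable compensator aims to recover.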
Entities
Institutions
- arXiv