Geometric Monomial (GEM): A Family of Smooth Activation Functions for Deep Neural Networks
A new family of activation functions, Geometric Monomial (GEM), has been proposed for deep neural networks. GEM comes in three variants: the standard GEM; E-GEM, which adds an ε-parameter enabling arbitrarily close L^p-approximation of ReLU; and SE-GEM, a piecewise variant that eliminates dead neurons while preserving C^{2N} smoothness. The functions combine smoothness with rational arithmetic and perform on par with ReLU. An N-ablation study found N = 1 optimal for standard-depth networks, shrinking the deficit to GELU on CIFAR-100 with ResNet-56 from 6.10% to 2.12%. The smoothness parameter N also points to a trade-off between CNN and transformer architectures. The paper is available on arXiv under ID 2604.21677.
Key facts
- GEM stands for Geometric Monomial.
- The activation functions are C^{2N}-smooth.
- Three variants: GEM, E-GEM, SE-GEM.
- E-GEM allows arbitrary L^p-approximation of ReLU (see the illustrative sketch after this list).
- SE-GEM eliminates dead neurons.
- N=1 is optimal for standard-depth networks.
- GELU deficit reduced from 6.10% to 2.12% on CIFAR-100 + ResNet-56.
- arXiv ID: 2604.21677.
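
The summary above does not reproduce the actual GEM/E-GEM formulas, so the snippet below is only a minimal, hypothetical sketch of the general idea: an ε-parameterised smooth surrogate of ReLU, here using the generic stand-in 0.5·(x + √(x² + ε²)) rather than the paper's definition. As ε shrinks, the surrogate tightens onto ReLU, which is the kind of L^p-approximation behaviour attributed to E-GEM.

```python
import numpy as np

def smooth_relu(x, eps=0.1):
    """Hypothetical epsilon-parameterised smooth ReLU surrogate.

    NOT the paper's E-GEM formula (this summary does not reproduce it);
    the standard form 0.5 * (x + sqrt(x^2 + eps^2)) is used purely to
    illustrate how an eps-parameter controls closeness to ReLU.
    """
    return 0.5 * (x + np.sqrt(x * x + eps * eps))

# As eps shrinks, the surrogate converges to ReLU, mirroring the
# arbitrary L^p-approximation property attributed to E-GEM.
x = np.linspace(-3.0, 3.0, 601)
relu = np.maximum(x, 0.0)
for eps in (1.0, 0.1, 0.01):
    gap = np.max(np.abs(smooth_relu(x, eps) - relu))
    print(f"eps = {eps:<5}: max pointwise gap to ReLU = {gap:.4f}")
```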