LLQR+SAM: Geometry-Aware Sharpness Minimization for Neural Networks
Researchers have proposed LLQR+SAM, an optimization method that augments sharpness-aware minimization (SAM) with a learned model of the loss geometry. SAM improves generalization by steering training away from sharp, high-curvature regions of the loss, but its perturbation treats all parameter directions uniformly. LLQR+SAM combines SAM with a preconditioner from the LLQR framework, a second-order method that recasts steepest descent as a layerwise linear-quadratic regulator (LQR) problem. The preconditioner is updated sparsely as a slow exponential moving average, which gives a smoothed, low-resolution view of the loss landscape. The SAM perturbation then operates on top of this learned geometry at a faster timescale. In theory, this two-timescale structure amplifies SAM's escape signal in directions that are locally sharp but flat in the averaged geometry, an effect likened to steering around potholes. The intended result is better generalization from a perturbation that adapts to the local curvature of the loss surface.
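The difference between a uniform SAM perturbation and one shaped by a preconditioner can be sketched in a few lines. This is a minimal illustration, not the paper's algorithm: the diagonal preconditioner, the ellipsoidal constraint, and the names `sam_perturbation` and `preconditioned_perturbation` are all assumptions made here for clarity. The source does not state how LLQR+SAM forms its perturbation.

```python
import math

def sam_perturbation(grad, rho):
    """Standard SAM ascent step: rho * g / ||g||_2, uniform in all directions."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm == 0.0:
        return [0.0 for _ in grad]
    return [rho * g / norm for g in grad]

def preconditioned_perturbation(grad, precond_diag, rho):
    """Ascent step inside the ellipsoid eps^T P^{-1} eps <= rho^2.

    Assumption (hypothetical, not from the paper): P is a positive diagonal
    matrix passed as a list. The maximizer of g^T eps under this constraint
    is rho * P g / sqrt(g^T P g).
    """
    pg = [p * g for p, g in zip(precond_diag, grad)]
    scale = math.sqrt(sum(g * v for g, v in zip(grad, pg)))
    if scale == 0.0:
        return [0.0 for _ in grad]
    return [rho * v / scale for v in pg]

grad = [3.0, 4.0]
eps_uniform = sam_perturbation(grad, rho=0.5)
# A larger entry in P stretches the perturbation along that coordinate.
eps_shaped = preconditioned_perturbation(grad, [4.0, 1.0], rho=0.5)
```

With an identity preconditioner the two functions agree, so the shaped version reduces to plain SAM when no geometry has been learned.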
Key facts
- LLQR+SAM combines SAM with a learned preconditioner from the LLQR framework.
- The preconditioner is updated sparsely as a slow exponential moving average.
- SAM perturbation operates on top of the learned geometry at a faster timescale.
- The two-timescale structure amplifies the escape signal in directions that are locally sharp but flat in the averaged geometry.
- The method is described in arXiv paper 2605.16134.
- SAM treats all parameter directions uniformly, ignoring loss geometry.
- LLQR is a second-order method recasting steepest descent as a layerwise LQR problem.
- The approach is likened to navigating around potholes with awareness of the surrounding geometry.
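The two-timescale structure described in the facts above can be sketched as a training loop in which the preconditioner is refreshed only every few steps while the SAM perturbation is recomputed at every step. The sketch below makes several assumptions: the toy quadratic loss, the hyperparameters, and especially the preconditioner itself are hypothetical. An inverse root of an exponential moving average of squared gradients stands in for the LLQR preconditioner, whose actual construction is not given in the source.

```python
import math

def grad_fn(w, curvatures):
    # Gradient of the toy loss 0.5 * sum(h_i * w_i^2).
    return [h * x for h, x in zip(curvatures, w)]

def loss_fn(w, curvatures):
    return 0.5 * sum(h * x * x for h, x in zip(curvatures, w))

def train(w, curvatures, steps=200, lr=0.05, rho=0.05,
          beta=0.99, update_every=10, damping=1e-8):
    # Slow state: EMA of squared gradients (a stand-in, NOT the LLQR update).
    second_moment = [1.0 for _ in w]
    precond = [1.0 for _ in w]
    n_precond_updates = 0
    for t in range(steps):
        g = grad_fn(w, curvatures)
        # Slow timescale: sparse EMA refresh of the diagonal preconditioner.
        if t % update_every == 0:
            second_moment = [beta * m + (1.0 - beta) * gi * gi
                             for m, gi in zip(second_moment, g)]
            precond = [1.0 / (math.sqrt(m) + damping) for m in second_moment]
            n_precond_updates += 1
        # Fast timescale: perturbation shaped by the current preconditioner.
        pg = [p * gi for p, gi in zip(precond, g)]
        scale = math.sqrt(sum(gi * v for gi, v in zip(g, pg)))
        eps = [rho * v / scale for v in pg] if scale > 0.0 else [0.0] * len(w)
        # SAM update: descend using the gradient at the perturbed point.
        g_adv = grad_fn([x + e for x, e in zip(w, eps)], curvatures)
        w = [x - lr * ga for x, ga in zip(w, g_adv)]
    return w, n_precond_updates

curv = [10.0, 0.1]   # one sharp and one flat direction
w0 = [1.0, 1.0]
w_final, n_updates = train(w0, curv)
```

Under this stand-in, coordinates with a small averaged gradient magnitude receive a larger share of the perturbation. Whether the LLQR preconditioner behaves the same way would need to be checked against the paper.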
Entities
Institutions
- arXiv