Layerwise LQR: A New Framework for Geometry-Aware Deep Learning Optimization
A recent paper introduces Layerwise LQR (LLQR), a framework for learning structured inverse preconditioners in deep learning under a global layerwise optimal-control objective. The work establishes a precise equivalence between steepest-descent steps in divergence-induced quadratic models (Newton, Gauss-Newton, Fisher/natural-gradient, and intermediate-layer metrics) and a finite-horizon Linear Quadratic Regulator (LQR) problem. This equivalence exposes layerwise dynamics and cost matrices that encode the original dense geometry, providing the basis for a scalable relaxation. The aim is to improve conditioning in deep learning while preserving the cross-layer interactions that scalable block-structured preconditioners such as K-FAC and Shampoo typically discard. The paper is available on arXiv under ID 2605.04230.
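To make the quadratic-model view concrete, the steepest-descent step can be written schematically as below; the notation (loss $\mathcal{L}$, curvature matrix $M$) is generic and not taken verbatim from the paper:

$$
\Delta\theta^\star \;=\; \arg\min_{\Delta\theta}\; \nabla\mathcal{L}(\theta)^\top \Delta\theta \;+\; \tfrac{1}{2}\,\Delta\theta^\top M(\theta)\,\Delta\theta \;=\; -\,M(\theta)^{-1}\,\nabla\mathcal{L}(\theta),
$$

where $M$ is the Hessian (Newton), the Gauss-Newton matrix, or the Fisher information (natural gradient). Splitting $\Delta\theta$ by layer turns this single dense solve into a chain of coupled per-layer subproblems, which is where the finite-horizon LQR reading enters.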
Key facts
- Paper introduces Layerwise LQR (LLQR) for geometry-aware optimization
- LLQR learns structured inverse preconditioners under a global layerwise optimal-control objective
- Establishes an equivalence between steepest-descent steps and a finite-horizon LQR problem (see the sketch after this list)
- Covers Newton, Gauss-Newton, Fisher/natural-gradient, and intermediate-layer metrics
- Exposes layerwise dynamics and cost matrices encoding dense geometry
- Scalable relaxation preserves cross-layer interactions
- Addresses limitations of K-FAC and Shampoo preconditioners
- Paper available on arXiv with ID 2605.04230
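As an illustration of the control-theoretic side of that equivalence, the following is a minimal sketch of a finite-horizon LQR solved by the standard backward Riccati recursion, with the layer index playing the role of the time index. The dynamics and cost matrices (A_l, B_l, Q_l, R_l) are generic placeholders, not the paper's actual layerwise construction.

```python
import numpy as np

def finite_horizon_lqr(A, B, Q, R, Q_final):
    """Backward Riccati recursion for a generic finite-horizon LQR.

    A, B, Q, R are lists of per-stage (here: per-layer) matrices.
    Returns the feedback gains K_l such that u_l = -K_l @ x_l.
    """
    L = len(A)
    P = Q_final                      # terminal cost-to-go
    gains = [None] * L
    for l in reversed(range(L)):
        # Standard discrete-time Riccati update
        BtP = B[l].T @ P
        K = np.linalg.solve(R[l] + BtP @ B[l], BtP @ A[l])
        P = Q[l] + A[l].T @ P @ (A[l] - B[l] @ K)
        gains[l] = K
    return gains

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    L, n, m = 4, 3, 2                # 4 "layers", state dim 3, control dim 2
    A = [np.eye(n) + 0.1 * rng.standard_normal((n, n)) for _ in range(L)]
    B = [rng.standard_normal((n, m)) for _ in range(L)]
    Q = [np.eye(n) for _ in range(L)]
    R = [np.eye(m) for _ in range(L)]
    gains = finite_horizon_lqr(A, B, Q, R, Q_final=np.eye(n))
    print([K.shape for K in gains])  # one (m, n) gain per layer
```

The point of the sketch is only that a finite-horizon LQR couples all stages through the backward recursion on the cost-to-go matrix, which is how cross-layer interactions can be retained without forming and inverting the full dense curvature matrix.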