Gradient descent dynamics in low-rank RNNs reveal hidden learning structure
A recent theoretical study posted on arXiv extends the low-rank framework from describing network activity to describing learning in recurrent neural networks (RNNs). The authors derive the dynamics of gradient descent in a reduced overlap space, obtaining a closed-form system of ordinary differential equations (ODEs) that describes learning exactly for linear RNNs and asymptotically exactly for nonlinear RNNs in the large-N Gaussian limit. They distinguish loss-visible overlaps, which determine the network's activity, output, and loss, from loss-invisible overlaps, which do not affect function but are nonetheless required to describe the learning dynamics. The work deepens the theoretical understanding of learning in low-rank RNNs by linking changes in connectivity to their functional consequences.
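To make the reduced overlap space concrete, here is a minimal NumPy sketch assuming the standard rank-one parameterization J = m nᵀ / N from the low-rank RNN literature; the variable names and the particular choice of overlaps are illustrative, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 2000  # network size; the overlap description becomes exact as N grows

# Rank-one connectivity J = m n^T / N, plus fixed input and readout vectors.
m, n = rng.standard_normal(N), rng.standard_normal(N)
I_vec, w = rng.standard_normal(N), rng.standard_normal(N)

# Overlaps: order-one scalars that summarize the N-dimensional vectors.
# For a linear rank-one network, the input-output map depends on only a
# handful of them; the rest are invisible to the loss but still evolve
# under gradient descent.
overlaps = {
    "m·n/N": m @ n / N,       # effective recurrent feedback gain
    "n·I/N": n @ I_vec / N,   # how strongly the input drives the recurrent mode
    "w·m/N": w @ m / N,       # how much of the recurrent mode the readout sees
    "w·I/N": w @ I_vec / N,   # direct input-to-readout pathway
    "m·m/N": m @ m / N,       # norm-like overlaps: absent from the output,
    "n·n/N": n @ n / N,       #   but they appear in the learning equations
}
for name, value in overlaps.items():
    print(f"{name} = {value:+.3f}")
```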
Key facts
- Paper published on arXiv with ID 2605.04115
- Extends low-rank framework from activity to learning
- Derives gradient-descent dynamics in reduced overlap space
- Formulates closed-form ODEs for learning in low-rank RNNs (see the toy sketch after this list)
- Exact for linear RNNs, asymptotically exact for nonlinear RNNs in the large-N Gaussian limit
- Distinguishes loss-visible and loss-invisible overlaps
- Loss-visible overlaps determine network activity, output, and loss
- Loss-invisible overlaps do not affect function but are required to describe learning
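As a hedged illustration of these points (not the paper's model or equations), the following NumPy toy trains a rank-one linear RNN, unrolled for two time steps, by gradient descent on the full N-dimensional vectors m and n, and simultaneously runs the same update purely in overlap space. In this linear toy the overlap recursion closes exactly, and the loss-invisible overlaps (m·n/N, w·n/N, m·I/N) move during learning even though the output never depends on them. The task, the two-step unrolling, and all names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
N, lr, steps, z_target = 4000, 0.1, 300, 1.0

# Fixed input and readout vectors; trainable rank-one connectivity J = m n^T / N.
I_vec, w = rng.standard_normal(N), rng.standard_normal(N)
m, n = rng.standard_normal(N), rng.standard_normal(N)

# Toy task (an assumption, not the paper's setup): unroll the linear network
# for two steps, x1 = I_vec, x2 = J x1 + I_vec, so the output is
#   z = w·x2/N = (w·m/N)(n·I/N) + w·I/N,
# and minimize (z - z_target)^2 / 2 by gradient descent on m and n.
W2, I2, d = w @ w / N, I_vec @ I_vec / N, w @ I_vec / N  # fixed constants
a, b = w @ m / N, n @ I_vec / N                          # loss-visible
c, wn, mI = m @ n / N, w @ n / N, m @ I_vec / N          # loss-invisible

for _ in range(steps):
    # Full gradient descent on the N-dimensional vectors (learning rate
    # scaled by N so that the overlaps change at order one).
    av, bv = w @ m / N, n @ I_vec / N
    err = av * bv + d - z_target
    m, n = m - lr * err * bv * w, n - lr * err * av * I_vec

    # The same step written purely in overlap space: a closed recursion.
    # The (lr*e)**2 cross term in c is a discretization effect that
    # vanishes in the continuous-time (ODE) limit.
    e = a * b + d - z_target
    a, b, c, wn, mI = (
        a - lr * e * b * W2,
        b - lr * e * a * I2,
        c - lr * e * (b * wn + a * mI) + (lr * e) ** 2 * a * b * d,
        wn - lr * e * a * d,
        mI - lr * e * b * d,
    )

print(f"output z = {(w @ m / N) * (n @ I_vec / N) + d:+.4f} (target {z_target})")
print("overlap              network      reduced")
for name, full, red in [
    ("w·m/N  (visible)  ", w @ m / N, a),
    ("n·I/N  (visible)  ", n @ I_vec / N, b),
    ("m·n/N  (invisible)", m @ n / N, c),
    ("w·n/N  (invisible)", w @ n / N, wn),
    ("m·I/N  (invisible)", m @ I_vec / N, mI),
]:
    print(f"{name}  {full:+.6f}   {red:+.6f}")
```

The full-network and reduced trajectories print identical overlap values because, for this linear toy, the gradient updates stay in the span of a few fixed vectors, so the scalar recursion reproduces the N-dimensional dynamics exactly; this is a discrete-time analogue, in spirit, of the exactness claim for linear networks.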
Entities
Institutions
- arXiv