arXiv Preprint Introduces Natural Gradient Descent with Momentum for Nonlinear Manifold Optimization
A preprint on arXiv (ID: 2604.15554v1) studies natural gradient descent (NGD) with momentum for function approximation on nonlinear manifolds. It considers optimization problems in which the approximating function is an element of a manifold with a differentiable parametrization, such as neural networks with differentiable activation functions or tensor networks. NGD is described as a preconditioned gradient descent whose parameter updates are driven by a functional viewpoint. In place of the Hessian, the preconditioner is the Gram matrix of the generating system of the tangent space at the current iterate. The resulting step is locally optimal in function space, because it follows the gradient projected onto the manifold's tangent space. The preprint notes that both gradient descent and natural gradient descent can get stuck in local minima, particularly on nonlinear manifolds or in complex optimization landscapes. The announcement type is cross, which on arXiv denotes a cross-listing into an additional subject category. The work is relevant to optimization in machine learning and computational mathematics.
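To make the Gram-matrix preconditioning concrete, here is a minimal sketch in Python with NumPy. It is not the preprint's algorithm. The two-parameter model a * tanh(b * x), the sample grid, the discrete least-squares loss, the damping term, and the heavy-ball form of the momentum update are all assumptions chosen for illustration, since the summary does not specify which momentum scheme the authors use. The columns of the Jacobian play the role of the generating system of the tangent space, and their Gram matrix replaces the Hessian as preconditioner.

```python
import numpy as np

def model(theta, x):
    """Hypothetical nonlinear model f_theta(x) = a * tanh(b * x)."""
    a, b = theta
    return a * np.tanh(b * x)

def jacobian(theta, x):
    """Partial derivatives of the model w.r.t. (a, b) at each sample point.

    Each column is one generator of the tangent space, sampled on x.
    """
    a, b = theta
    t = np.tanh(b * x)
    return np.stack([t, a * x * (1.0 - t**2)], axis=1)

def loss(theta, x, y):
    """Discrete least-squares loss (an assumption, not taken from the preprint)."""
    return 0.5 * np.mean((model(theta, x) - y) ** 2)

def ngd_momentum(theta0, x, y, lr=0.5, beta=0.3, damping=1e-8, steps=100):
    """Natural gradient descent with an assumed heavy-ball momentum term."""
    theta = np.asarray(theta0, dtype=float)
    velocity = np.zeros_like(theta)
    n = x.size
    for _ in range(steps):
        J = jacobian(theta, x)
        residual = model(theta, x) - y
        grad = J.T @ residual / n          # Euclidean gradient of the loss
        gram = J.T @ J / n                 # Gram matrix of the tangent generators
        # Precondition with the (slightly damped) Gram matrix instead of the Hessian.
        nat_grad = np.linalg.solve(gram + damping * np.eye(theta.size), grad)
        # Heavy-ball update: keep a fraction of the previous step.
        velocity = beta * velocity - lr * nat_grad
        theta = theta + velocity
    return theta

# Synthetic target generated by the same model with known parameters.
x = np.linspace(-2.0, 2.0, 50)
y = model(np.array([1.5, 0.8]), x)

theta_init = np.array([1.0, 1.0])
theta_final = ngd_momentum(theta_init, x, y)
initial_loss = loss(theta_init, x, y)
final_loss = loss(theta_final, x, y)
print("initial loss:", initial_loss)
print("final loss:  ", final_loss)
```

Setting beta to 0 recovers a plain damped NGD step, which for this least-squares loss coincides with a Gauss-Newton step. The small damping term only guards against a singular Gram matrix, which can occur when the tangent generators become linearly dependent.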
Key facts
- arXiv preprint ID: 2604.15554v1
- Announcement type: cross
- Focuses on natural gradient descent (NGD) with momentum
- Addresses function approximation using nonlinear manifolds
- Examples include neural networks with differentiable activation functions and tensor networks
- NGD uses the Gram matrix of the tangent space's generating system instead of the Hessian
- Both gradient descent and NGD can get stuck in local minima
- Research relevant to optimization in machine learning and computational mathematics
Entities
Institutions
- arXiv