Adam Optimizer Modeled as Rod Flow at Edge of Stability
Researchers have extended the rod flow model, originally developed for gradient descent, to the Adam optimizer and seven other momentum-based optimizers. The model treats consecutive iterates as an elongated one-dimensional rod, yielding a continuous-time description of optimization behavior near the edge of stability. The analysis operates in the joint phase space of the parameters and the first moment, with the second moment treated as a smoothly varying auxiliary variable. The eight optimizers covered are heavy ball momentum, Nesterov momentum, and both scalar and per-component variants of RMSProp, Adam, and NAdam. Empirical evaluations on representative machine learning architectures show that the rod flow closely tracks the discrete iterates in the edge-of-stability regime. The paper is available on arXiv as 2605.06821.
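To ground the terminology, here is a minimal sketch of the standard per-component Adam update, showing the "first moment" m (the momentum state that joins the parameters in the rod flow phase space) and the "second moment" v (the auxiliary variable). This is the textbook Adam recursion with common default hyperparameters, not code or notation taken from the paper itself.

```python
import numpy as np

def adam_step(theta, m, v, grad, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One per-component Adam update. (theta, m) is the phase-space state
    discussed above; v is the slowly varying auxiliary second moment."""
    m = beta1 * m + (1 - beta1) * grad        # first moment (momentum)
    v = beta2 * v + (1 - beta2) * grad**2     # second moment (adaptive scale)
    m_hat = m / (1 - beta1**t)                # bias correction
    v_hat = v / (1 - beta2**t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Example: a few steps on the quadratic f(theta) = 0.5 * theta**2,
# whose gradient is simply theta.
theta, m, v = np.array([1.0]), np.zeros(1), np.zeros(1)
for t in range(1, 101):
    theta, m, v = adam_step(theta, m, v, grad=theta, t=t)
```

In the continuous-time rod flow view, this discrete recursion in (theta, m) is replaced by a flow on the same joint phase space, which the paper reports tracks the iterates even at the edge of stability.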
Key facts
- Rod flow model extended to Adam and seven other optimizers.
- Model treats consecutive iterates as a one-dimensional rod.
- Operates in joint phase space of parameters and first moment.
- Second moment treated as smooth auxiliary variable.
- Optimizers covered: heavy ball, Nesterov, and scalar and per-component RMSProp, Adam, and NAdam (eight total).
- Empirical evaluation on representative ML architectures.
- Paper available on arXiv:2605.06821.
- Builds on prior work by Cohen et al. and Regis et al.