PILOT: Adaptive Optimizer Adjusts Update Rules During Training

ai-technology · 2026-05-26

PILOT (Policy-Informed Learned OpTimizer) is an optimizer that changes its update rule during deep learning training based on gradient-direction agreement. Traditional static optimizers fix their functional form before training begins, but training can move between stable, noisy, and inconsistent gradient regimes across the loss landscape. PILOT treats gradient-direction agreement as a signal of local training stability and uses it to shift among momentum, normalization, and sign-based updates as gradients become stable, noisy, or inconsistent. In experiments on FashionMNIST and CIFAR-10, PILOT achieved the highest accuracy among the evaluated optimizers on convolutional networks.
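
The switching idea can be sketched in a few lines of plain Python. This is an illustration only, not PILOT's actual algorithm: the source does not say how agreement is measured, how regimes are separated, or which update rule pairs with which regime. The sketch assumes agreement is the cosine similarity between the current gradient and a running average of past gradients, uses hand-picked thresholds (0.5 and 0.0) where PILOT presumably uses a learned policy, and maps stable to momentum, noisy to normalization, and inconsistent to sign updates. The names `cosine_agreement`, `pick_rule`, and `step` are hypothetical.

```python
import math


def cosine_agreement(grad, running, eps=1e-12):
    """Cosine similarity between the current gradient and a running direction.

    Assumption: this is one plausible way to measure gradient-direction
    agreement; the source does not define the measure.
    """
    dot = sum(g * r for g, r in zip(grad, running))
    norm_g = math.sqrt(sum(g * g for g in grad))
    norm_r = math.sqrt(sum(r * r for r in running))
    if norm_g < eps or norm_r < eps:
        return 0.0  # no direction to compare against yet
    return dot / (norm_g * norm_r)


def pick_rule(agreement, stable_above=0.5, noisy_above=0.0):
    """Map an agreement score to an update rule (thresholds are assumptions)."""
    if agreement >= stable_above:
        return "momentum"    # gradients agree: treated as stable
    if agreement >= noisy_above:
        return "normalized"  # weak agreement: treated as noisy
    return "sign"            # opposing directions: treated as inconsistent


def step(params, grad, state, lr=0.1, beta=0.9, eps=1e-12):
    """Apply one update, choosing the rule from gradient-direction agreement."""
    rule = pick_rule(cosine_agreement(grad, state["m"]))

    # Exponential moving average of gradients, updated after measuring agreement.
    state["m"] = [beta * m + (1 - beta) * g for m, g in zip(state["m"], grad)]

    if rule == "momentum":
        update = state["m"]
    elif rule == "normalized":
        norm = math.sqrt(sum(g * g for g in grad)) + eps
        update = [g / norm for g in grad]
    else:
        update = [math.copysign(1.0, g) if g != 0 else 0.0 for g in grad]

    new_params = [p - lr * u for p, u in zip(params, update)]
    return new_params, rule


if __name__ == "__main__":
    params = [1.0, -1.0]
    state = {"m": [0.0, 0.0]}
    # Two agreeing gradients followed by one that reverses direction.
    for grad in ([2.0, -2.0], [2.0, -2.0], [-2.0, 2.0]):
        params, rule = step(params, grad, state)
        print(rule, params)
```

With the running average starting at zero, the first step falls back to the normalized update; a repeated gradient then selects momentum, and a reversed gradient selects the sign update. A learned policy would replace the fixed thresholds in `pick_rule`.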

Key facts

PILOT stands for Policy-Informed Learned OpTimizer
It adapts update behavior during training based on gradient-direction agreement
Gradient-direction agreement signals local training stability
The optimizer adjusts when gradients become stable, noisy, or inconsistent
Tested on FashionMNIST and CIFAR-10 datasets
Achieved the highest accuracy among the evaluated optimizers on convolutional networks
Static optimizers fix their functional form before training begins
Training may shift between stable, noisy, and inconsistent regimes

Entities

—

Sources

arXiv cs.AI — 2026-05-26