AdaFRUGAL: Dynamic Memory Optimization for LLM Training
AdaFRUGAL introduces dynamic control of memory-efficient training for large language models, automating hyperparameter tuning that previously required manual intervention. The method extends the FRUGAL framework with two schedules: a linear decay for the subspace ratio (ρ) and a loss-aware schedule for the update frequency (T). Experiments on English C4 and Vietnamese VietVault pre-training, as well as GLUE fine-tuning, show that AdaFRUGAL stays competitive with AdamW and static FRUGAL while reducing GPU memory usage and training time, making it a practical option for resource-constrained environments.
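The summary does not spell out either schedule, so the following is a minimal Python sketch of the linear ρ decay only; the endpoints (rho_init, rho_final) and the horizon (total_steps) are assumptions for illustration, not values from the paper.

```python
def linear_rho(step: int, total_steps: int,
               rho_init: float = 0.25, rho_final: float = 0.05) -> float:
    """Linearly decay the subspace ratio rho from rho_init to rho_final.
    All three constants are illustrative assumptions, not paper values."""
    frac = min(step / max(total_steps, 1), 1.0)
    return rho_init + (rho_final - rho_init) * frac
```

If ρ controls the share of parameters updated with a full-state optimizer, as in FRUGAL-style methods, shrinking it over training would be the source of the memory savings.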
Key facts
- AdaFRUGAL automates hyperparameter tuning for FRUGAL's subspace ratio (ρ) and update frequency (T).
- Uses a linear decay for ρ (sketched above) and a loss-aware schedule for T (see the sketch after this list).
- Tested on English C4 and Vietnamese VietVault pre-training, and GLUE fine-tuning.
- Maintains competitive performance against AdamW and static FRUGAL.
- Reduces GPU memory usage and training time.
- Targets resource-constrained LLM training.
- Published on arXiv (2601.11568).
- Authors not specified in source.
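The loss-aware rule for T is likewise not specified in the source. A minimal sketch, assuming T is stretched when the loss plateaus so the subspace is refreshed less often once progress slows, could look like this; the class name, window size, tolerance, and T bounds are all hypothetical.

```python
from collections import deque

class LossAwareT:
    """Hypothetical loss-aware schedule for the subspace update interval T.
    The plateau heuristic and every constant here are illustrative assumptions."""

    def __init__(self, T_min: int = 50, T_max: int = 500,
                 window: int = 100, plateau_tol: float = 1e-3):
        self.T_min, self.T_max = T_min, T_max
        self.losses = deque(maxlen=window)
        self.plateau_tol = plateau_tol

    def update(self, loss: float) -> int:
        """Record the latest training loss and return the interval T to use."""
        self.losses.append(loss)
        if len(self.losses) < self.losses.maxlen:
            return self.T_min  # early training: refresh the subspace often
        improvement = self.losses[0] - self.losses[-1]
        # Loss still falling fast -> keep refreshes frequent (small T);
        # loss flattening out -> stretch T to cut refresh overhead.
        return self.T_min if improvement > self.plateau_tol else self.T_max
```

Whether the paper stretches or shrinks T at a plateau is not stated in this summary; the direction above is one plausible reading.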
Entities
Institutions
- arXiv