ARTFEED — Contemporary Art Intelligence

AdaFRUGAL: Dynamic Memory Optimization for LLM Training

ai-technology · 2026-04-30

AdaFRUGAL introduces dynamic controls for memory-efficient training of large language models, automating hyperparameter tuning that previously required manual intervention. The method extends the FRUGAL framework by incorporating a linear decay for the subspace ratio (ρ) and a loss-aware schedule for the update frequency (T). Experiments on English C4 and Vietnamese VietVault pre-training datasets, as well as GLUE fine-tuning, show AdaFRUGAL maintains competitive performance against AdamW and static FRUGAL while reducing GPU memory usage and training time, making it a practical option for resource-constrained environments.
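The two schedules can be sketched as follows. This is a minimal illustration of the idea described above, not the paper's implementation: the decay endpoints, window size, and plateau threshold are illustrative assumptions, as are the function names.

```python
# Hedged sketch of AdaFRUGAL-style schedules for the subspace ratio (rho)
# and subspace-update frequency (T). All constants below are assumptions
# for illustration, not values from the paper.

def linear_rho(step: int, total_steps: int,
               rho_start: float = 0.25, rho_end: float = 0.05) -> float:
    """Linearly decay the subspace ratio rho from rho_start to rho_end."""
    frac = min(step / max(total_steps, 1), 1.0)
    return rho_start + (rho_end - rho_start) * frac

def loss_aware_T(recent_losses: list[float],
                 T_base: int = 200, T_min: int = 50, T_max: int = 800,
                 tol: float = 1e-3) -> int:
    """Pick the next update interval T from the recent loss trend:
    refresh the subspace more often when the loss has plateaued,
    less often while it is still falling quickly."""
    if len(recent_losses) < 2:
        return T_base
    improvement = recent_losses[0] - recent_losses[-1]
    if improvement < tol:
        # Loss stagnating: shorten the interval to refresh the subspace sooner.
        return max(T_min, T_base // 2)
    # Loss still improving: keep the current subspace longer.
    return min(T_max, T_base * 2)
```

In a FRUGAL-style training loop, `linear_rho` would set the fraction of parameters given full-rank (state-full) updates at each step, while `loss_aware_T` would decide how many steps to wait before re-selecting the subspace.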

Key facts

  • AdaFRUGAL automates hyperparameter tuning for FRUGAL's subspace ratio (ρ) and update frequency (T).
  • Uses linear decay for ρ and loss-aware schedule for T.
  • Tested on English C4 and Vietnamese VietVault pre-training, and GLUE fine-tuning.
  • Maintains competitive performance against AdamW and static FRUGAL.
  • Reduces GPU memory usage and training time.
  • Targets resource-constrained LLM training.
  • Published on arXiv (2601.11568).
  • Authors not specified in source.

Entities

Institutions

  • arXiv

Sources