ARTFEED — Contemporary Art Intelligence

Gradient-Based Method Optimizes Pretraining Loss Weights Online

other · 2026-05-11

A new gradient-based bilevel method learns pretraining loss weights online by aligning the composite pretraining gradient with a downstream objective. By exploiting the structure of the loss, it avoids the multiple backward passes per step that bilevel optimization typically requires, keeping hyperparameter-tuning overhead to roughly 30% above the cost of a single training run. Evaluated on event-sequence modeling and self-supervised computer vision, the method matches or improves upon carefully tuned baselines.

Key facts

  • Proposes gradient-based bilevel method for online loss weight learning
  • Aligns composite pretraining gradient with downstream objective
  • Avoids multiple backward passes by exploiting the structure of the loss
  • Reduces hyperparameter tuning overhead to ~30% above single run
  • Evaluated on event-sequence modeling and self-supervised computer vision
  • Matches or improves upon carefully tuned baselines
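The core idea of online gradient alignment can be sketched simply: score each pretraining loss by how well its gradient agrees with the downstream objective's gradient, then shift weight toward the aligned losses. The sketch below is illustrative only; the function name, the cosine score, and the multiplicative-weights update are assumptions, not the paper's exact algorithm.

```python
import numpy as np

def update_loss_weights(per_loss_grads, downstream_grad, weights, lr=0.1):
    """One online weight update: increase the weight of pretraining losses
    whose gradient aligns with the downstream objective's gradient.
    Illustrative sketch; the update rule here is an assumption."""
    # Alignment score: cosine similarity between each pretraining-loss
    # gradient and the downstream gradient.
    scores = np.array([
        g @ downstream_grad
        / (np.linalg.norm(g) * np.linalg.norm(downstream_grad) + 1e-12)
        for g in per_loss_grads
    ])
    # Multiplicative-weights style update, renormalized onto the simplex.
    new_w = weights * np.exp(lr * scores)
    return new_w / new_w.sum()

# Toy example: one loss gradient aligned with the downstream direction,
# one opposed to it.
g1 = np.array([1.0, 0.0])   # aligned with downstream gradient
g2 = np.array([-1.0, 0.0])  # opposed
gd = np.array([1.0, 0.0])   # downstream gradient
w = update_loss_weights([g1, g2], gd, np.array([0.5, 0.5]))
# the weight on the aligned loss grows; weights still sum to 1
```

Because both loss gradients come from the same backward pass through the shared model, a single composite backward can supply everything this update needs, which is the kind of structure the method exploits to avoid repeated backward passes.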
