ARTFEED — Contemporary Art Intelligence

RELO: Reinforcement Learning Improves Visual Object Tracking

ai-technology · 2026-05-11

Researchers have introduced RELO, a reinforcement-learning-based method for visual object tracking that replaces traditional handcrafted spatial priors with a learned localization policy. The method formulates target localization as a Markov decision process and trains the policy with rewards that combine frame-level intersection over union (IoU) with sequence-level area under the success curve (AUC). A layer-aligned temporal token propagation module improves semantic consistency across frames at negligible computational cost. On the LaSOT_ext benchmark, RELO reaches 57.5% AUC without template updates, which the researchers report as outperforming prior methods. Because the rewards are the tracking metrics themselves, the approach addresses the misalignment between surrogate training losses and the criteria trackers are actually evaluated on.
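To make the reward description concrete, here is a minimal Python sketch of how a frame-level IoU term and a sequence-level success-curve AUC term could be blended into one per-frame reward. It is an illustration under stated assumptions, not RELO's actual implementation: the article does not say how the two terms are weighted or shaped, so the `auc_weight` parameter, the linear blend, the 21-threshold success curve, and the function names are all hypothetical.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x, y, width, height)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def success_auc(ious: Sequence[float], num_thresholds: int = 21) -> float:
    """Area under the success curve: mean success rate over evenly
    spaced overlap thresholds in [0, 1] (21 thresholds is an assumption)."""
    if not ious:
        return 0.0
    thresholds = [i / (num_thresholds - 1) for i in range(num_thresholds)]
    rates = [sum(v > t for v in ious) / len(ious) for t in thresholds]
    return sum(rates) / len(rates)


def combined_rewards(
    preds: Sequence[Box], gts: Sequence[Box], auc_weight: float = 0.5
) -> List[float]:
    """Per-frame reward: frame IoU blended with the whole sequence's AUC.
    The linear blend and auc_weight are assumptions made for illustration."""
    ious = [iou(p, g) for p, g in zip(preds, gts)]
    auc = success_auc(ious)
    return [(1.0 - auc_weight) * v + auc_weight * auc for v in ious]
```

In this sketch every frame in a sequence shares the same AUC term, so a policy is credited both for tight boxes on individual frames and for keeping the target across the whole sequence.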

Key facts

  • RELO replaces handcrafted spatial priors with a learned localization policy via reinforcement learning.
  • Target localization is formulated as a Markov decision process.
  • Rewards combine frame-level IoU and sequence-level AUC.
  • Layer-aligned temporal token propagation improves semantic consistency across frames.
  • Achieves 57.5% AUC on LaSOT_ext without template updates.
  • Method addresses misalignment between surrogate supervision and tracking metrics.
