Low-Rank Bandits with Subspace Drift: Tight Bounds

other · 2026-05-22

A paper posted to arXiv (2605.20269) studies low-rank linear contextual bandits whose latent subspace is non-stationary: it shifts at unknown segment boundaries. The authors prove tight identification and regret bounds. They show that recovering the subspace requires three probing conditions: known noise variance, limited state-noise coupling, and comprehensive probe support. They also derive a minimax lower bound of Ω(r√(KT)) and give an algorithm with Õ(r√(KT)) regret, which matches the lower bound up to logarithmic factors.
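To make the r√(KT) rate concrete, the following is a minimal arithmetic sketch. It only evaluates the leading term, dropping constants and logarithmic factors. The summary does not define r, K, or T, so the function treats them as generic positive quantities; the name `regret_rate` and the sample values are hypothetical and chosen for illustration only.

```python
import math


def regret_rate(r: float, K: float, T: float) -> float:
    """Leading-order term r * sqrt(K * T); constants and log factors are dropped."""
    return r * math.sqrt(K * T)


# Hypothetical sample values, not taken from the paper.
base = regret_rate(r=2, K=4, T=10_000)
print(base)                                   # 400.0

# The rate is linear in r: doubling r doubles the term.
print(regret_rate(4, 4, 10_000) / base)       # 2.0

# The rate grows like sqrt(T): quadrupling T only doubles the term.
print(regret_rate(2, 4, 40_000) / base)       # 2.0

# Divided by T, the term shrinks as T grows (sublinear growth).
print(base / 10_000)                          # 0.04
```

Because the lower and upper bounds share this same leading term, any gap between them can only come from the logarithmic factors hidden in the Õ notation.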

Key facts

Paper arXiv:2605.20269
Studies piecewise-stationary low-rank linear contextual bandits
Rewards live on a low-dimensional latent subspace that drifts
Three necessary conditions for subspace identification
Minimax lower bound Ω(r√(KT))
Algorithm achieves Õ(r√(KT)) regret
Tight bounds along three axes
Single-play scalar rewards
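The setting in the list above can be sketched as a small simulation: a reward parameter that lies in an r-dimensional subspace of a d-dimensional space, with the subspace redrawn at each segment boundary and one scalar reward observed per round. This is an illustrative sketch, not the paper's model or algorithm. The function names, the Gaussian actions and noise, the equal-length segments, and all default values are assumptions.

```python
import numpy as np


def random_subspace(d, r, rng):
    """Return a d x r matrix with orthonormal columns (a random r-dim subspace)."""
    q, _ = np.linalg.qr(rng.standard_normal((d, r)))
    return q


def simulate(d=20, r=3, n_segments=4, T=400, noise_std=0.1, seed=0):
    """Simulate piecewise-stationary low-rank rewards (all choices are assumptions)."""
    rng = np.random.default_rng(seed)
    # Assumption: equal-length segments. In the paper the boundaries are unknown
    # to the learner; here they are fixed only to generate data.
    boundaries = np.linspace(0, T, n_segments + 1).astype(int)
    rewards = np.empty(T)
    thetas = np.empty((T, d))
    bases = []
    for k in range(n_segments):
        U = random_subspace(d, r, rng)   # the subspace drifts: new basis per segment
        w = rng.standard_normal(r)       # low-dimensional coordinates
        theta = U @ w                    # reward parameter lies in span(U)
        bases.append(U)
        for t in range(boundaries[k], boundaries[k + 1]):
            x = rng.standard_normal(d)   # placeholder action; no learning happens here
            rewards[t] = x @ theta + noise_std * rng.standard_normal()  # scalar reward
            thetas[t] = theta
    return rewards, thetas, boundaries, bases


rewards, thetas, boundaries, bases = simulate()
print(rewards.shape, boundaries)
```

Within a segment the parameter is fixed and lies in the span of that segment's basis, so projecting it onto the basis leaves it unchanged. A learner in this setting would have to re-estimate the basis after each boundary without being told where the boundaries are.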
