ARTFEED — Contemporary Art Intelligence

TAVIS Benchmark for Active Vision in Imitation Learning

ai-technology · 2026-05-11

Researchers have launched TAVIS, a new benchmark for evaluating active vision in imitation learning. Active vision lets a policy direct its gaze while performing a task; it has shown clear benefits, but until now there has been no unified framework for evaluating it. TAVIS comprises two task suites: TAVIS-Head, five tasks that use pan/tilt necks for global search, and TAVIS-Hands, three tasks that use wrist cameras to cope with local occlusion. The tasks are implemented on two humanoid torsos, GR1T2 and Reachy2, built in IsaacLab. The benchmark also provides three evaluation components: a paired headcam-versus-fixed-cam comparison, the Gaze-Action Lead Time (GALT) metric for quantifying predictive gaze, and procedural ID/OOD splits.
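The article does not give the GALT formula, so the following is only a minimal sketch of what a gaze-action lead-time metric could look like; the function name, event representation, and the definition (lead time = action onset minus the most recent preceding gaze fixation on the target, positive when gaze anticipates the action) are all assumptions, not the benchmark's actual specification.

```python
# Hypothetical sketch of a Gaze-Action Lead Time (GALT) style metric.
# Assumption: lead time = action onset time minus the time the gaze
# most recently landed on the task-relevant target before that action.

def gaze_action_lead_times(gaze_hits, action_onsets):
    """gaze_hits: sorted timestamps when gaze lands on the target;
    action_onsets: sorted timestamps when the hand acts on the target.
    Returns per-event lead times (positive = gaze led the action)."""
    leads = []
    gi = 0
    last_hit = None
    for t_act in action_onsets:
        # advance to the most recent gaze hit at or before this action
        while gi < len(gaze_hits) and gaze_hits[gi] <= t_act:
            last_hit = gaze_hits[gi]
            gi += 1
        if last_hit is not None:
            leads.append(t_act - last_hit)
    return leads

# Toy episode: gaze reaches each target well before the hand does.
leads = gaze_action_lead_times([0.8, 3.1], [1.5, 4.0])
mean_galt = sum(leads) / len(leads)
```

Under this reading, a larger mean lead time indicates more anticipatory gaze, which is presumably the behavior the metric is meant to reward.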

Key facts

  • TAVIS is a benchmark for active-vision imitation learning.
  • It includes TAVIS-Head (5 tasks) and TAVIS-Hands (3 tasks).
  • Embodiments used: GR1T2 and Reachy2.
  • Built on IsaacLab.
  • GALT metric quantifies anticipatory gaze.
  • Paired headcam-vs-fixedcam protocol is included.
  • Procedural ID/OOD splits are provided.
  • Active vision has emerged as a key capability in imitation learning.
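The paired headcam-vs-fixedcam protocol listed above could be sketched as follows; this is an assumption about how "paired" is operationalized (same episodes or seeds rolled out under both camera configurations, with per-episode outcomes compared pairwise), and the function name and inputs are illustrative only.

```python
# Hypothetical sketch of a paired headcam-vs-fixedcam evaluation.
# Assumption: the same episode seeds are evaluated under both camera
# setups, so success rates can be compared on matched episodes.

def paired_success_delta(head_success, fixed_success):
    """Per-episode success flags (1/0) for the same seeds under each
    camera. Returns the paired success-rate difference
    (headcam minus fixed cam); positive favors active vision."""
    assert len(head_success) == len(fixed_success)
    n = len(head_success)
    return (sum(head_success) - sum(fixed_success)) / n

# Toy result on four matched episodes.
delta = paired_success_delta([1, 1, 0, 1], [1, 0, 0, 0])
```

Pairing on the same episodes removes scene-to-scene variance from the comparison, which is the usual reason to prefer a paired protocol over comparing two independently sampled evaluation sets.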
