ARTFEED — Contemporary Art Intelligence

New AI Framework Improves In-Context Object Localization Without Category Supervision

ai-technology · 2026-06-01

A research paper introduces a two-stage training framework for in-context object localization (ICL) that operates without category supervision. The method explicitly optimizes attention between support bounding boxes and query images, and uses reinforcement learning to refine localization. It addresses a limitation of existing vision-language models, which rely on category labels and can introduce bias as a result. The approach aims to enable category-agnostic, visually grounded localization for applications such as image editing and personalized search. The paper is available on arXiv under ID 2605.31145.

Key facts

  • In-context localization (ICL) locates a target object in a query image from support examples, without further training or parameter updates.
  • Existing methods require explicit category supervision, which makes them hard to apply to unnamed or instance-specific objects.
  • The new framework uses a two-stage training process to optimize in-context attention without category labels.
  • Reinforcement learning further refines localization performance.
  • The approach targets applications such as image editing, personalized visual search, and retrieval.
  • The paper is published on arXiv with ID 2605.31145.
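
To make the task setup concrete, here is a minimal Python sketch of what a category-free localization episode might look like, together with intersection-over-union (IoU), a standard way to score a predicted box against a reference box. The class names, field names, and box format are illustrative assumptions, not taken from the paper. The summary above does not say which reward or metric the reinforcement learning stage uses, so IoU appears here only as a common example of a localization score.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


@dataclass
class SupportExample:
    """One support image with a box around the target (hypothetical structure)."""
    image_path: str
    box: Box  # note: no category label is attached


@dataclass
class LocalizationEpisode:
    """Support examples plus the query image in which to find the target."""
    supports: List[SupportExample]
    query_image_path: str


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix_min, iy_min = max(a[0], b[0]), max(a[1], b[1])
    ix_max, iy_max = min(a[2], b[2]), min(a[3], b[3])
    # Clamp at zero so disjoint boxes have no intersection area.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

In this sketch the target is specified only by the support boxes, which is what lets the same structure describe unnamed or instance-specific objects.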

Entities

Institutions

  • arXiv